### GÖTTINGER STUDIEN ZUR ENTWICKLUNGSÖKONOMIK / GÖTTINGEN STUDIES IN DEVELOPMENT ECONOMICS

Thomas Otter

# **Poverty, Income Growth and Inequality in Paraguay During the 1990s**

Spatial Aspects, Growth Determinants and Inequality Decomposition


The Paraguayan economy did not suffer a debt crisis in the eighties and had significant growth rates in the second half of the seventies, but poverty remained a problem. Understanding the performance and spatial distribution of poverty and inequality over a period of more than ten years can shed new light on the structural causes behind what seems to be a low growth – high poverty – high inequality trap in Paraguay. How did poverty and inequality change during the 1990s? Did inequality reduce income growth? What were the growth determinants and what are the main forces driving inequality changes? These are the questions answered in this book.

Thomas Otter is a researcher associated with the Ibero-America Institute for Economic Research of the University of Göttingen (Germany). He holds a doctorate in economics from the same university. The author has worked as a consultant for different development agencies in Latin America, Africa, and Asia. His research interests include pro-poor growth, inequality, and human development.

Thomas Otter - 978-3-631-75367-5 Downloaded from PubFactory at 01/11/2019 05:51:05AM

via free access


# **Göttinger Studien zur Entwicklungsökonomik / Göttingen Studies in Development Economics**

Herausgegeben von/ Edited by Hermann Sautter und/and Stephan Klasen

Bd./Vol. 23


### **Bibliographic Information published by the Deutsche Nationalbibliothek**

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data is available in the internet at <http://www.d-nb.de>.

Open Access: The online version of this publication is published on www.peterlang.com and www.econstor.eu under the international Creative Commons License CC-BY 4.0. Learn more on how you can use and share this work: http://creativecommons.org/licenses/by/4.0.

This book is available Open Access thanks to the kind support of ZBW – Leibniz-Informationszentrum Wirtschaft.

Zugl.: Göttingen, Univ., Diss., 2007

Gratefully acknowledging the support of the Ibero-Amerika-Institut für Wirtschaftsforschung, Göttingen.

> ISBN 978-3-631-75367-5 (eBook) Cover illustration by Rolf Schinke 07 ISSN 1439-3395 ISBN 978-3-631-57201-6

© Peter Lang GmbH Internationaler Verlag der Wissenschaften Frankfurt am Main 2008 All rights reserved.

All parts of this publication are protected by copyright. Any utilisation outside the strict limits of the copyright law, without the permission of the publisher, is forbidden and liable to prosecution. This applies in particular to reproductions, translations, microfilming, and storage and processing in electronic retrieval systems.

Printed in Germany 1 2 3 4 5 7

### www.peterlang.de

Reducing inequality and poverty at an individual level is not only a problem of the efforts being carried out by men and women, to improve their conditions of life, but of their opportunities to do so.

# **Acknowledgements**

I am grateful to the national statistics bureau of Paraguay (Dirección General de Estadística, Encuestas y Censos - DGEEC) and Pablo Sauma for their help in providing the survey and census data. I would also like to thank all my classmates and colleagues at the University of Göttingen for their numerous comments and suggestions, which helped to improve this work considerably. I am particularly indebted to Peter Lanjouw and Jesko Hentschel from the World Bank. My very special thanks go to Prof. Stephan Klasen and Prof. Michael Grimm for their excellent supervision, patience, permanent support and encouragement, and to Prof. Stephan Klasen for the opportunity he gave me by accepting me as one of his students.

Asuncion, March 2007 Thomas Otter

# **Contents**




# **List of Tables**




# **List of Figures**


### **Acronyms**


# **Preface**

During the nineteen seventies, Latin America was a promising region and seemed to have prosperous development perspectives. However, the nineteen eighties went by as the "lost decade" for most countries, mainly due to external debt crises. Although overall development perspectives were reduced during this period, the region harbored good economic performers, such as Chile, and even within some countries more and less prosperous regions remained, such as the south of Brazil compared to the northern region. Nevertheless, this better performance did not benefit large parts of the population. To this day poverty remains an unresolved problem, even in Chile and Brazil. Paraguayan economic history is both similar to and different from this general trend. Even though there were no debt crises in the eighties and growth rates were substantial in the second half of the seventies, poverty remained an unresolved problem in the eighties and a growing problem during the nineties. Looking back in time, prior to 2005, it would seem that Paraguay is, in a way, locked in what is known as a "low growth high poverty trap". Understanding the performance of poverty and inequality over a period of more than 10 years can throw new light on the structural causes behind what seems to be a "low growth high poverty high inequality trap" in Paraguay.

In recent years, there has been increasing empirical evidence worldwide that inequality levels and inequality changes are powerful determinants for poverty levels. Reducing poverty might be a tool for inequality reduction, if the effects of inequality on poverty are well understood, and vice versa.

This dissertation focuses on poverty and inequality issues in Paraguay during the nineties. In the first chapter, poverty levels and their spatial distribution are estimated. The second chapter searches for the effects of income and education inequality on growth, using the results of the first chapter as input. In chapter three, a decomposition of changes in inequality is carried out in order to better understand the dynamics behind inequality changes and their impact on poverty. The chapters are written to be read separately; consequently, some methodological repetitions were included.

Persistent poverty can be a serious impediment for growth. In the first chapter, a poverty and inequality mapping exercise shows that poverty levels and their spatial patterns were almost the same at the beginning of the 1990s and during the first few years of the 2000s. A small poverty decrease during the second half of the 1990s was not sustainable. So in a way, this is evidence that we have persistent poverty in Paraguay.

However, there may be some opportunities to reduce poverty without necessarily going through an important economic growth. In India, there are some states that are more efficient than others in reducing poverty through growth. Some of their strategies attack poverty in an attempt to reduce inequality even if there is not much economic growth. Consequently, there seems to be an interesting link between inequality and growth. To better understand which links existed between inequality and income growth during the nineties in Paraguay, chapter 2 takes the results from chapter 1 (estimated mean incomes by district) and examines the impact of initial income and education inequality on income growth. Since this exercise is based on poverty maps, this allows for the corroboration of spatial patterns and regional differences in the effects of income and education inequality on income growth.

Chapters 1 and 2 portray almost unchanged poverty levels at the beginning and at the end of a ten-year period, while also showing some reduction in income inequality. Questions on what drives inequality reduction during a period of almost nonexistent economic growth and poverty reduction are answered halfway through chapter 3. These answers are the result of a microeconometric decomposition based on three different household surveys.

# **Chapter I**

# **Micro Level Estimation of Income**

### **Simulated welfare mapping (poverty maps) for Paraguay 1992 and 2002**

### **1.1. Introduction**

Recent theoretical and empirical advances have brought income and wealth distributions back into a prominent position in growth and development theories, and as determinants of specific socio-economic outcomes, such as health, levels of violence and the related phenomenon of inequality. To improve empirical investigation, new techniques were required for the simulation of small scale welfare indicators, such as income and its distribution. Elbers, Lanjouw and Lanjouw (2003) designed a statistical procedure to combine different types of data and take advantage of the detail in household sample surveys and the comprehensive coverage of a census. The method extends the literature on small area statistics (Ghosh and Rao (1994), Rao (1999)) by developing estimators of population parameters which are non-linear functions of the underlying variable of interest (for example per capita income), deriving them from the full unit level distribution of that variable. The best-known output of these exercises is the "poverty map". Poverty maps are an important tool for implementing poverty reduction policy: they are used to select the poorest villages in a country (or the villages with the greatest number of poor people) for programs such as Bolsa Escola in Brazil, Progresa in Mexico, Puente in Chile, Bolsa Familia in Argentina, Bono de Desarrollo Humano in Ecuador or Tekoporã in Paraguay, all of them conditional cash transfer programs that pay directly to extremely poor households.<sup>1</sup>

<sup>1</sup> The first poverty map for Paraguay was built by Marcos Robles (2000), combining the 1992 Population Census with 1997/98 household survey data and using the methodology proposed in Hentschel et al. (2000), although the Government did not start using this kind of tool until 2003. In 2003, the Government urgently needed to update the poverty maps with census and survey data from 2002. The author of this paper carried out this update for the Social Ministry using the Elbers et al. (2003) methodology, based on a 10% sample of census data (the only census sub-sample available by the end of 2003) and the 2002 household survey. The results were the input for the "Índice de Priorización de Gasto" (IPG), a geographic targeting tool for household cash transfer programs. In 2004, the IPG ranking was updated by Marcos Robles and Horacio Santander with the entire census data from 2002 and 2003 household survey data. Although a number of poverty maps for Paraguay already exist, the results shown in this paper are the only ones that combine the entire 2002 census with 2002 survey data and, what is more, are the only research results on poverty maps using 1992 census and 1992 survey data.

In this paper, the method of Elbers, Lanjouw and Lanjouw (2003) is applied to Paraguayan data from 1992 and 2002, producing estimates with levels of precision comparable to those of commonly used survey-based welfare estimates, but for populations of fewer than 1,000 people living within the same village. This is an enormous improvement over survey-based estimates, which are typically only consistent for areas encompassing hundreds of thousands, or even millions, of households. Experience using the method in South Africa, Brazil, Panama, Madagascar, and Nicaragua suggests that the method is reliable (Alderman *et al.* (2002), and Elbers, Lanjouw, Lanjouw, and Leite (2004)).

### **1.2 Methodology and Data**

### **1.2.1 The Basic Methodology**

Paraguayan household surveys collect very detailed information on household characteristics, including income;<sup>2</sup> however, their coverage is limited and they are only representative for relatively large geographical units. The Paraguayan population census, by contrast, covers all households, but collects very limited information on household characteristics and no information on income. The methodology developed by Elbers, Lanjouw and Lanjouw (2003) attempts to combine the advantage of detailed information on household characteristics obtained from a household survey with the complete coverage of a population census.

By combining the respective strengths of survey and census data, the simulated-welfare mapping method aims to estimate welfare indicators for small administrative areas. The approach uses household survey data to estimate a model of per capita income (or any other household or individual-level indicator of wellbeing) as a function of variables that are available in both the household survey and the population census.

The resulting parameter estimates from this estimation procedure are then used in a simulation to predict per capita income for each household in the census. Using the predicted per capita income, household level measures of poverty and inequality are then calculated and aggregated for small areas, such as districts, sub-districts, or villages. This explains the origin of the name 'simulated-welfare mapping' for the method.

<sup>2</sup> Income-based poverty estimates are the official poverty measurement in Paraguay, carried out and updated periodically by the National Statistical Office (DGEEC). Official poverty lines (a caloric consumption line for extreme poverty and a basic family basket for the moderate poverty line) are updated by inflation for four different regions of the country: Asuncion, Central Urban, Remaining Urban and Rural.

Importantly, the method allows for the calculation of standard errors for either welfare measure estimated. This feature is critical in that it offers a means of assessing the statistical reliability of the estimates as well as of comparisons across estimates for different communities.
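The core idea of the basic methodology can be illustrated with a minimal Python sketch. It is not the procedure used in this study: it assumes a single matched covariate, a closed-form OLS fit and a plain point prediction without simulated error terms, and the function names, data layout and poverty line are hypothetical.

```python
import math

def fit_log_income(survey):
    """Closed-form OLS of log per capita income on one matched covariate.

    survey: list of (covariate, per_capita_income) pairs from the survey.
    Returns (intercept, slope).
    """
    xs = [x for x, _ in survey]
    ys = [math.log(income) for _, income in survey]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def headcount_by_district(census, intercept, slope, poverty_line):
    """Predict income for every census household and return the share of
    households below the poverty line in each district.

    census: list of (district, covariate) pairs.
    """
    poor, total = {}, {}
    for district, x in census:
        predicted = math.exp(intercept + slope * x)
        total[district] = total.get(district, 0) + 1
        poor[district] = poor.get(district, 0) + (predicted < poverty_line)
    return {d: poor[d] / total[d] for d in total}
```

A pure point prediction of this kind ignores the unexplained part of income, which is why the full method simulates the error terms before computing poverty and inequality measures.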

### **1.2.2 The Income Model**

Following Elbers *et al.* (2001, 2003), the empirical model of household income is defined as:

$$\ln y_{vh} = E(\ln y_{vh} \mid x_{vh}) + u_{vh} \tag{1.1}$$

where $\ln y_{vh}$ is the logarithm of per capita income of household $h$ in village $v$, $x_{vh}$ is a vector of observed characteristics of this household (including village level variables), and $u_{vh}$ is the error term. Note that we assume $u_{vh}$ is uncorrelated with $x_{vh}$. This model is simplified by using a linear approximation to the conditional expectation $E(\ln y_{vh} \mid x_{vh})$ and decomposing $u_{vh}$ into uncorrelated terms:

$$
u_{vh} = \eta_{v} + \varepsilon_{vh} \tag{1.2}
$$

where $\eta_{v}$ represents a village level error term common to all households within the village, and $\varepsilon_{vh}$ is a household specific error term. It is further assumed that $\eta_{v}$ is uncorrelated across villages and $\varepsilon_{vh}$ is uncorrelated across households. With these assumptions, equation (1.1) reduces to

$$
\ln y_{vh} = x_{vh}\beta + \eta_{v} + \varepsilon_{vh}. \tag{1.3}
$$

Estimation of the parameters underlying this equation, in particular the vector of parameters $\beta$ and the distributional characteristics of the error terms, can be done by using standard tools from econometric analysis (Elbers *et al.*, 2003).
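The error structure of equation (1.3) can be made tangible with a short Python sketch that generates log incomes from the model. The scalar covariate, the normal distributions for both error terms, and all names and parameter values are illustrative assumptions, not part of the original specification.

```python
import random

def simulate_log_income(x_by_village, beta, sd_eta, sd_eps, rng):
    """Draw ln y_vh = x_vh * beta + eta_v + eps_vh for every household.

    x_by_village: dict mapping village id -> list of scalar covariates x_vh.
    eta_v is drawn once per village, eps_vh once per household.
    """
    out = {}
    for village, xs in x_by_village.items():
        eta_v = rng.gauss(0.0, sd_eta)  # shared by all households in the village
        out[village] = [x * beta + eta_v + rng.gauss(0.0, sd_eps) for x in xs]
    return out
```

Because $\eta_v$ is drawn once per village, the errors of two households in the same village are correlated, while the errors of households in different villages are not.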

### **1.2.3 The Implementation Procedure**

The standard procedure to implement the simulated-welfare mapping method for creating a map of mean income by sub-national administrative unit consists of five steps:

### *Step 1: Matching Variables in the Survey and the Census*

In order to obtain rigorous estimates of income levels of the households in the census, the explanatory variables selected in the income determination model have to exist and be measured in the same way in both the household survey and in the census. If the sample of the household survey was randomly selected and is nationally representative, the distribution of each explanatory variable in the household survey can be expected to be the same as its distribution in the census.<sup>3</sup>

### *Step 2: Selecting Explanatory Variables for the Income Model*

The selection of the explanatory variables in the income model starts by running a regression of log per capita income, using the survey data base, on the matched variables identified in Step 1, as well as some variables that can be created from other variables such as the square and cube of household size. In order to obtain a robust specification, variables are only selected for inclusion in the model if they contribute significantly to the explanation of per capita income. Hence variables with low statistical significance are dropped from the model.<sup>4</sup>

After a promising set of variables has been selected in this way, the regression is run again and the residuals of this regression are saved. These residuals need to be scrutinized for outliers. If some residual values lie far outside the range of most of the others, the corresponding observations must be checked for coding or other errors. Ultimately, it may be necessary to delete them from the data.

The village level variables are obtained from either the population census data aggregated at the village level (for example the total population or the mean age of household heads in each village) or from other administrative data sources. These survey and census data can be complemented with other data sources, mostly administrative data, such as the existence of public schooling (number of schools in a district) or infrastructure (kilometers of asphalt roads). These variables are then grouped into several sets such as demographic variables, village infrastructure variables, and village economic variables.

<sup>3</sup> As a matter of fact, only variables that have the same distribution in census and survey are selected for inclusion in the income prediction models.

<sup>4</sup> Two kinds of variables are dropped. First, dummy variables whose frequencies are < 0.03 or > 0.97 are dropped (most of them would be expected to be insignificant anyway, since they show low variance). This ensures that the values of the variables included in the model show some variance, which can influence the variance of predicted income. Second, all other variables which are not significant in the regression are dropped in order to make the models as robust as possible.

The residuals of the last regression are then aggregated at the village level to calculate the mean of these residuals for each village. The variable selection is then carried out by running separate regressions of the village-level mean of residuals on each set of the village-level variables. The variables with significant t-values are selected as candidates for inclusion in the income model.

The feasibility of including these candidate village-level variables in the income model is tested by running regressions of the village dummy variables on these variables, one regression for each village dummy variable. If the coefficient of a certain variable in a regression is one, there is perfect multicollinearity between this variable and the village dummy variable. This will happen if, for example, one village has a certain infrastructure while no other village has it or, conversely, all villages except one have it. Such variables are necessarily excluded from the model.
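The frequency screen for dummy variables described in footnote 4 is easy to express in code. The following Python sketch is only an illustration of that rule; the function name, the data layout and the example variable names are hypothetical.

```python
def screen_dummies(columns, low=0.03, high=0.97):
    """Keep only dummy variables whose share of ones lies within [low, high].

    columns: dict mapping variable name -> list of 0/1 values.
    Dummies with a frequency below `low` or above `high` are dropped.
    """
    kept = {}
    for name, values in columns.items():
        share = sum(values) / len(values)
        if low <= share <= high:
            kept[name] = values
    return kept
```

For example, a dummy that equals one for 1% or for 99% of the households would be dropped, while one that equals one for half of the households would be kept as a candidate regressor.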

### *Step 3: Estimating the Income Model*

The result of step 2 is a complete specification of the income model, incorporating both household-level and village-level independent variables. The next step is to test whether there is heteroscedasticity in the data. This will determine the method to be employed to estimate the model. The first step to accomplish this is to estimate the model of equation (1.3) using Ordinary Least Squares (OLS) and save the residuals as the variable $\hat{u}_{vh}$.

Based on equation (1.2), the residuals $\hat{u}_{vh}$ are then decomposed into uncorrelated components as:

$$
\hat{u}_{vh} = \hat{u}_{v\cdot} + \left(\hat{u}_{vh} - \hat{u}_{v\cdot}\right) = \hat{\eta}_{v} + e_{vh} \tag{1.4}
$$
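The decomposition in (1.4) amounts to taking village means of the residuals and the household deviations from those means. A minimal Python sketch, with a hypothetical function name and data layout:

```python
def decompose_residuals(residuals_by_village):
    """Split each OLS residual u_vh into a village mean (the estimate of
    eta_v) and a household deviation e_vh, so that u_vh = eta_v + e_vh."""
    eta_hat, e = {}, {}
    for village, residuals in residuals_by_village.items():
        mean_v = sum(residuals) / len(residuals)
        eta_hat[village] = mean_v
        e[village] = [u - mean_v for u in residuals]
    return eta_hat, e
```

By construction the household components sum to zero within each village, and adding the two components back together recovers the original residual.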

To investigate the presence of heteroscedasticity in the data, a set of potential variables that best explain the variation in $e_{vh}^{2}$ is used to estimate the following logistic model:

$$\ln\left[\frac{e_{vh}^{2}}{A - e_{vh}^{2}}\right] = z_{vh}^{T}\hat{\alpha} + r_{vh} \tag{1.5}$$

where we take $A$ as being equal to $1.05 \cdot \max\{e_{vh}^{2}\}$, as in Elbers *et al.* (2003). This specification puts bounds on the predicted variance of $\varepsilon_{vh}$.

In the case where homoscedasticity is rejected, a household specific variance estimator for $\varepsilon_{vh}$ is calculated as:

$$\begin{aligned} \hat{\sigma}_{\varepsilon,vh}^{2} &= \left[ \frac{AB}{1+B} \right] + \frac{1}{2} \widehat{\mathrm{Var}}(r) \left[ \frac{AB(1-B)}{(1+B)^{3}} \right] \\ \text{where} \quad B &= \exp\left\{ z_{vh}^{T} \hat{\alpha} \right\}. \end{aligned} \tag{1.6}$$
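Equations (1.5) and (1.6) can be written as two small functions. The Python sketch below only evaluates the formulas for given inputs; it does not estimate $\hat{\alpha}$, and the function and argument names are hypothetical.

```python
import math

def logistic_transform(e_squared, bound):
    """Left-hand side of (1.5): ln[e^2 / (A - e^2)], defined for 0 < e^2 < A."""
    return math.log(e_squared / (bound - e_squared))

def household_variance(z_alpha, bound, var_r):
    """Variance estimator of (1.6) for one household.

    z_alpha: fitted index z'alpha-hat; bound: A; var_r: estimated variance of r.
    """
    b = math.exp(z_alpha)
    first = bound * b / (1.0 + b)
    correction = 0.5 * var_r * bound * b * (1.0 - b) / (1.0 + b) ** 3
    return first + correction
```

With $\widehat{\mathrm{Var}}(r) = 0$ the second function is the exact inverse of the first, and the leading term $AB/(1+B)$ always lies between 0 and $A$, which is the bound mentioned in the text.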

The income model is then re-estimated using the Generalized Least Squares (GLS) method, employing the estimated variance-covariance matrix with the structure shown in (1.7), resulting from equation (1.6), and weighted by the population weight, $l_{vh}$. The estimated parameters, $\hat{\beta}_{GLS}$, and their variance, $\mathrm{Var}(\hat{\beta}_{GLS})$, are saved for use in the simulation.

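$$
\begin{bmatrix}
\operatorname{var}(\eta_{v}) + \operatorname{var}(\varepsilon_{vh}) & \operatorname{var}(\eta_{v}) & \operatorname{var}(\eta_{v}) & \operatorname{var}(\eta_{v}) \\
\operatorname{var}(\eta_{v}) & \operatorname{var}(\eta_{v}) + \operatorname{var}(\varepsilon_{vh}) & \operatorname{var}(\eta_{v}) & \operatorname{var}(\eta_{v}) \\
\operatorname{var}(\eta_{v}) & \operatorname{var}(\eta_{v}) & \operatorname{var}(\eta_{v}) + \operatorname{var}(\varepsilon_{vh}) & \operatorname{var}(\eta_{v}) \\
\operatorname{var}(\eta_{v}) & \operatorname{var}(\eta_{v}) & \operatorname{var}(\eta_{v}) & \operatorname{var}(\eta_{v}) + \operatorname{var}(\varepsilon_{vh})
\end{bmatrix} \tag{1.7}
$$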

### *Step 4: Simulations on Census Data*

The purpose of this step is to apply the parameters estimated in the previous step to the census data. However, since these parameters are obtained through estimation, they are not the precise parameter values but are subject to sampling error. The sampling error of the coefficient estimates needs to be taken into account when applying the parameters to the census data. To start, recall that the purpose is to calculate the simulated version of equation (1.3):

$$
\ln y_{vh}^{s} = x_{vh}\beta^{s} + \eta_{v}^{s} + \varepsilon_{vh}^{s} \tag{1.8}
$$

where the superscript $s$ refers to the simulated version of each parameter or variable, and $x_{vh}$ now refers to the characteristics of the households in the population census data.

### *Simulation of β*

The simulated value of $\beta$ is obtained through a random draw, assuming

$$
\beta \sim N\left(\hat{\beta}_{GLS}, \operatorname{var}\left(\hat{\beta}_{GLS}\right)\right).
$$

Note that the draw has to take into account the covariance across the $\beta$'s. The randomly drawn parameter is defined as $\beta^{s}$. The next step is then to apply this simulated parameter to each household in the census data to calculate the value of $x_{vh}\beta^{s}$.
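One common way to honor the covariance across coefficients when drawing from a multivariate normal distribution is to use a Cholesky factor of the covariance matrix. The Python sketch below illustrates that technique with the standard library only; it is not taken from the software used in this study, and the function names and example values are hypothetical.

```python
import math
import random

def cholesky(matrix):
    """Lower-triangular L with L L' = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def draw_beta(beta_hat, cov, rng):
    """One draw from N(beta_hat, cov), honoring covariance across coefficients."""
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in beta_hat]  # independent standard normals
    return [b + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i, b in enumerate(beta_hat)]
```

Drawing each coefficient independently from its own marginal distribution would ignore the off-diagonal terms of $\operatorname{var}(\hat{\beta}_{GLS})$ and would misstate the variability of $x_{vh}\beta^{s}$.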

### *Simulation of η<sub>v</sub>*

The process of obtaining the simulated value of $\eta_{v}$ requires two steps of simulation. This is because the variance of $\eta$ is itself estimated with error. Hence, the first step is to obtain the simulated variance of $\eta$, $\sigma_{\eta}^{2,s}$. Elbers *et al.* (2003) propose to draw $\sigma_{\eta}^{2,s}$ from a gamma distribution:

$$
\sigma_{\eta}^{2} \sim G\left(\hat{\sigma}_{\eta}^{2}, \widehat{\operatorname{var}}(\sigma_{\eta}^{2})\right).
$$

Consequently, a random draw of the variance for the whole sample is made and its mean is defined as $\sigma_{\eta}^{2,s}$. The second step is then to randomly draw $\eta_{v}^{s}$ for each village in the census data, assuming $\eta_{v} \sim N(0, \sigma_{\eta}^{2,s})$.
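The two-stage draw can be sketched in Python as follows. Two assumptions are made here that the text does not spell out: $G(\cdot,\cdot)$ is read as a gamma distribution with the stated mean and variance, and a single variance draw is used per simulation round. Function names and numerical values are hypothetical.

```python
import random

def gamma_params(mean, variance):
    """Shape and scale of a gamma distribution with the given mean and variance."""
    return mean ** 2 / variance, variance / mean

def draw_village_effects(villages, var_eta_hat, var_of_var_eta, rng):
    """Two-stage draw: first a variance from the gamma distribution,
    then one eta_v ~ N(0, drawn variance) per village."""
    shape, scale = gamma_params(var_eta_hat, var_of_var_eta)
    var_eta_s = rng.gammavariate(shape, scale)  # mean = shape * scale
    sd = var_eta_s ** 0.5
    return var_eta_s, {v: rng.gauss(0.0, sd) for v in villages}
```

The moment matching follows from the gamma distribution having mean shape × scale and variance shape × scale², so a variance estimate of 0.10 with an estimated variance of 0.0005 corresponds to a shape of 20 and a scale of 0.005.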

### *Simulation of ε<sub>vh</sub>*

The process of obtaining the simulated value of $\varepsilon_{vh}$ requires the use of the estimation results of equation (1.5). Assuming

$$
\alpha \sim N\left(\hat{\alpha}, \operatorname{var}(\hat{\alpha})\right),
$$

a random draw of $\alpha$ is made and defined as $\alpha^{s}$. As in the case of $\beta$, the draw has to take into account the covariance across the $\alpha$'s. The simulated parameter is then used to simulate the household specific variance estimator for $\varepsilon_{vh}$ as defined in equation (1.6) for each household in the census data. Finally, the simulated value of the household specific idiosyncratic error, $\varepsilon_{vh}^{s}$, for every household in the census data is obtained by taking a random draw, assuming $\varepsilon_{vh} \sim N(0, \sigma_{\varepsilon,vh}^{2,s})$.<sup>5</sup>

### *Collecting*

Now that all three components of equation (1.8) have been simulated, the value of $\ln y_{vh}^{s}$ for all households in the census data can be calculated by summing the values of $x_{vh}\beta^{s}$, $\eta_{v}^{s}$ and $\varepsilon_{vh}^{s}$ that have been obtained. The whole set of simulations is then repeated a number of times (150 in our case), so that in the end a database of 150 simulated values of (log) per capita household income for all the households in the census data is created. This is mainly to see whether the variance across these 150 simulations in this fixed effects exercise is acceptably small.

<sup>5</sup> Elbers *et al.* (2003) mention alternatives for the assumption that the error component terms follow normal distributions. In separate sets of simulations we have experimented with these alternative assumptions. In no case did this lead to significantly different results.

### *Step 5: Calculation of Poverty and Inequality Indicators*

The final output of Step 4 is a database of 150 simulated values of household income of all households in the census data. This database is used as the basis for calculating point estimates and standard errors of various poverty and inequality measures at the department, district and village levels. The point estimate of each measure is the mean of the calculated measure over the 150 simulated household incomes. Meanwhile, the standard error of this estimate is equal to the standard deviation of the calculated measure over the 150 simulated household incomes. The welfare indicators of a region - at any level - are calculated directly from the data of all individual households residing in that region.<sup>6</sup>
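The calculation of a point estimate and its standard error from the simulated incomes can be sketched in Python for one welfare measure, the poverty headcount. The sketch assumes that the standard deviation across simulations is the population standard deviation (the text does not say which variant is used), and the function names and data layout are hypothetical.

```python
import statistics

def headcount(incomes, poverty_line):
    """Share of households with income below the poverty line."""
    return sum(y < poverty_line for y in incomes) / len(incomes)

def estimate_with_se(simulated_incomes, poverty_line):
    """Point estimate and standard error of the headcount for one area.

    simulated_incomes: list of replicates; each replicate is the list of
    simulated incomes of all households in the area.
    """
    rates = [headcount(rep, poverty_line) for rep in simulated_incomes]
    return statistics.mean(rates), statistics.pstdev(rates)
```

In the exercise described here, `simulated_incomes` would hold the 150 replicates for a department, district or village, and the same pattern applies to any other poverty or inequality measure.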

### **1.2.4 Data Sources**

Four sources of data were used: (i) Encuesta de Condiciones de Vida (ECV) 1992, (ii) Censo Nacional de Población y Vivienda (CNPV) 1992, (iii) Encuesta Permanente de Hogares (EPH) 2002, and (iv) CNPV 2002. Both censuses and the EPH 2002 were carried out by the Paraguayan National Statistical Office DGEEC (Dirección General de Estadística, Encuestas y Censos), while the 1992 household survey was carried out by the National University and the Inter-American Development Bank.

Both surveys are representative household surveys covering all areas of the country, with representative results for four different regions: Asuncion, Central Urban, Remaining Urban and Rural.<sup>7</sup> In the 1992 survey, 5,059 households (22,257 individuals) were interviewed, while 3,789 households (17,600 individuals) were interviewed in 2002. In general, both surveys follow the format of a World Bank Living Standards Measurement Survey. The population censuses of 2002 and 1992 are, respectively, the sixth and fifth population censuses carried out in Paraguay, both by DGEEC, in a systematic and comparable way. The previous censuses were carried out in 1950, 1962, 1972 and 1982. All censuses are carried out during the month of August and cover the entire population living within Paraguayan territory, including foreign residents.

<sup>6</sup> Steps 3 to 5 of this poverty mapping exercise were implemented using a computer program called PovMap (Version 1.2.4, February 2005), developed by Qinghua Zhao at the World Bank. All other steps were carried out using SPSS 13.0.

<sup>7</sup> The Asuncion region only includes the city of Asuncion, while Central Urban covers the urban areas of the most populated department of Paraguay, the "Departamento Central". Most of these urban areas are direct neighbors of Asuncion and together form a kind of metropolitan region, excluding Asuncion. Remaining Urban includes all other urban areas except Asuncion and Central Urban. Their common characteristic is that they are urban, although they do not form a continuous geographic area. Rural includes all rural areas. The 1992 and 2002 household surveys exclude from their sampling the departments of Boquerón and Alto Paraguay in the Chaco region, both remote rural areas, on the one hand due to budget constraints and, on the other, because less than 3% of the population lives in these two departments.

### **1.3 Results**

### **1.3.1 Regression Results**

In order to estimate per capita income for every household in the census, we use a set of individual characteristics of the household head, such as age, years of schooling or economic activity; characteristics of the spouse and the family group, such as the number of children and their schooling or economic activity; and characteristics of the habitat, access to basic services and assets within the household. We also tried some local infrastructure data such as kilometers of asphalt road, the number of public schools, and the existence of a post office, public transport or government-organized market places in the district.<sup>8</sup> To reinforce the empirical evidence on spatial effects, we controlled for the number of directly neighboring districts<sup>9</sup> and the percentage of economic activity by sector in neighboring districts.

As the aim of the regression models is to predict per capita income as precisely as possible for every household in the census, using coefficients from regressions based on household surveys and including only variables common to survey and census, the household survey regression results should not be understood as regressions on determinants of income, but as regressions on variables which are correlated with income. Variables that are correlated with income, and not only variables which determine income, are used in order to achieve good predictions.

<sup>8</sup> Most of these proved insignificant and were excluded from the final models (see footnote 4). Only for the Remaining Urban area in 2002 was the variable kilometers of asphalt roads in each district (ROAD) significant and included in the final model (see Table 1.9).

<sup>9</sup> The number of directly neighboring districts (NVD) can be understood, for the Paraguayan economy, as a proxy for closeness to areas of higher economic dynamism. Four of the five most economically important cities are border cities, three of them with twin cities on the other side of the border. Because they are border districts, their number of directly neighboring districts is smaller than for other districts within the country. The NVD variable can thus serve as a proxy for the local effects of these more dynamic border cities.

### **Table 1.1 Variable definitions for 1992 estimates**


Source: CNPV and ECV 1992



**Table 1.2 Regression results Asuncion 1992** 


**Table 1.3 Regression results Central Urban 1992** 

Source: Author's calculations based on ECV 1992



**Table 1.4 Regression results Remaining Urban 1992** 

Source: Author's calculations based on ECV 1992


**Table 1.5 Regression results Rural 1992** 

Source: Author's calculations based on ECV 1992



**Table 1.6 Variable definitions 2002** 


**Table 1.7 Regression results Asuncion 2002** 

Source: Author's calculations based on EPH 2002



**Table 1.8 Regression results Central Urban 2002**

Source: Author's calculations based on EPH 2002



**Table 1.9 Regression results Remaining Urban 2002** 

Source: Author's calculations based on EPH 2002


**Table 1.10 Regression results Rural 2002** 

Source: Author's calculations based on EPH 2002

All eight models produce acceptable results, with most adjusted R-squared values above 0.6, $\mathrm{var}(\eta_c)$ between 0.06 and 0.15, $\mathrm{var}(\mathrm{var}(\eta_c))$ below 0.001 and $\mathrm{var}(\varepsilon_{ch})$ between 4.6 and 5.9. For 1992, a higher number of significant variables (13 to 22) was identified. Additionally, the 1992 models include between 4 and 11 cluster mean variables, which identify local effects in the error term. In 2002, only 2 models include cluster means and the number of significant census variables is much smaller. For 1992 and 2002, all coefficients on census variables have the expected signs, but not all the cluster means do. Interestingly, the variables included to produce additional evidence on wider spatial effects (crossing district borders) produce the expected results. In 1992, NVD has a negative coefficient for Central Urban, and the percentage of tertiary sector employment in neighboring districts has a positive sign in the Remaining Urban and Rural models. For 2002, the percentage of primary sector employment in neighboring districts has a negative sign for the Remaining Urban and Rural areas. Infrastructure data such as the availability of public transport, education and health care institutions, post offices and public marketplaces did not produce any significant results for income estimation. Only for the Remaining Urban area in 2002 did we find a significant and positive effect of kilometers of asphalt road.

### **1.3.2 Poverty Estimates**

The poverty measures calculated are the poverty headcount index (P0), the poverty gap index (P1), and the poverty severity index (P2) from the FGT family of poverty measures.<sup>10</sup> The inequality measures calculated are the Gini ratio and the General Entropy measures (GE0, GE1, GE2), as well as decile mean incomes, but only the P0 and Gini results are reported in this paper. In addition to the poverty and inequality indicators as usually presented, the simulated-welfare mapping exercise also provides the standard errors of these estimates as a measure of their precision.
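As an illustration of the indicators named above, the following sketch computes the FGT measures and the Gini ratio from a vector of per capita incomes. The function names and the toy incomes are hypothetical, households are unweighted, and the poor are taken to be those with income strictly below the poverty line, which is an assumption of this sketch.

```python
import numpy as np

def fgt(income, z, alpha):
    """FGT poverty measure: alpha=0 headcount, 1 poverty gap, 2 severity."""
    income = np.asarray(income, dtype=float)
    gap = np.clip((z - income) / z, 0.0, None)  # normalized shortfall
    poor = income < z                           # assumed strict inequality
    return float(np.mean(np.where(poor, gap ** alpha, 0.0)))

def gini(income):
    """Gini ratio from unweighted incomes, using the rank formula."""
    x = np.sort(np.asarray(income, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * x) / (n * np.sum(x)) - (n + 1.0) / n)

# Hypothetical incomes and poverty line.
incomes = [50.0, 80.0, 120.0, 200.0]
line = 100.0

p0 = fgt(incomes, line, 0)  # share of households below the line
p1 = fgt(incomes, line, 1)  # mean normalized shortfall
p2 = fgt(incomes, line, 2)  # mean squared normalized shortfall
g = gini(incomes)
```

In a simulation setting, these functions would be evaluated on each simulated income vector, and the spread of the results across simulations would give the standard errors mentioned in the text.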

Tables 1.11 and 1.12 compare the headcount poverty rates reported in the household surveys in 1992 and 2002 with those estimated from the Population Census data (standard errors of the simulations in round brackets, standard errors as a percentage of estimated poverty in square brackets).


**Table 1.11 Percentage of Poverty - 1992**

Source: Author's calculations based on ECV 1992 and CNPV 1992.

<sup>10</sup> Foster *et al.* (1984).

For 1992, extreme poverty is higher in the Asuncion, Remaining Urban and Rural simulations, even when standard errors are taken into account. Simulated mean income is higher than observed mean income in the Asuncion, Central Urban and Rural regions. This seems to be a consequence of differences between the distribution of observed incomes in the survey and that of simulated incomes based on census data. Overall poverty is almost the same except for the Remaining Urban area. We overestimate extreme poverty and underestimate moderate poverty but obtain an almost exact result for overall poverty, except in Remaining Urban, where simulated overall poverty exceeds observed poverty by 12%. The overall overestimate is 4%. Since we overestimate mean income but nevertheless obtain higher poverty rates, the income distribution observed in the survey and the simulated income distribution in the census estimates appear to differ.<sup>11</sup>



**Table 1.12 Percentage of Poverty - 2002**

Notes: Extreme poverty line: 142,308 (ASU); 140,717 (CU); 106,802 (RU); 73,501 (RUR). Poverty line: 321,229 (ASU); 317,998 (CU); 197,895 (RU); 118,483 (RUR).

Source: Author's calculations based on EPH 2002 and CNPV 2002.

Considering standard errors, the estimates of extreme poverty for 2002 fit in all regions except the Rural area. In the 2002 simulations, total poverty exceeds observed poverty in all areas by an average of about 5%. We are still overestimating mean income, although not as much as in 1992.

To understand the possible reasons for the differences between the results observed in the surveys and those estimated from the census database, several tests were carried out. There is almost no difference between observed (survey) and simulated (census) covariate levels. In only one of the eight models in use (Central Urban 2002) did simulated census covariate levels differ from those observed in the survey by more than 1%. Furthermore, there is no problem with the residual assumption in the survey regressions: in all cases, residuals have mean zero and low variance. Consequently, if there are no structural problems in the models themselves, one possible source of bias could be a sampling problem in the actual surveys, for example in the classification of households into urban and rural areas. In fact, for 2002 the maximum difference in population share by region between survey and census is 0.2%. However, in 1992 there is a classification problem for the Asuncion and Rural areas (Asuncion has 11.6% of the population in the census and 14.9% in the survey; the Rural area has 48.6% in the census and 45.6% in the survey). The maximum difference for Central Urban and Remaining Urban is 0.5%. Recall that the 1992 survey was the first nationwide household survey ever carried out in Paraguay, so there may have been some lack of experience in attending to all the details. This hypothesis is consistent with the differences between the 1992 and 2002 simulations, where biases in 2002 are much smaller than in 1992. Sampling problems seem to introduce biases into the results through the variance of the household-specific error ($\varepsilon_{ch}$): the variance of the 1992 errors exceeds that of the 2002 errors by between 9 and 23 percent.

<sup>11</sup> One possible source of these distribution differences is the fact that observed household income in the survey "is not continuous" (that is, several of the higher centiles [>80] are empty, with no observations). In the simulation exercises, by contrast, there will be estimates for a "continuous distribution".



Source: Author's calculations based on ECV 1992, CNPV 1992, EPH 2002 and CNPV 2002.

For 1992 we underestimate inequality for Asuncion but overestimate it in all other regions. These differences also seem to be a consequence of the differences between census and survey in the allocation of households to rural and urban areas in 1992, which affect the Asuncion and Rural areas. For 2002, estimated inequality in the Asuncion and Rural areas is slightly higher than observed inequality, and we underestimate inequality in the Central Urban and Remaining Urban areas.

### **1.3.3 Poverty and Inequality Maps**

When examining these maps, it should be kept in mind that they have been created using the *expected* headcount. The *true* headcount for a location will differ from the expected headcount because of sampling and modeling errors. The maps do not take the errors into account.

**Figure 1.1 FGT0 Per capita income 1992 at district level**

Source: Author's calculations based on ECV 1992 and CNPV 1992


### **Figure 1.2 Gini Per capita income 1992 at district level**

Source: Author's calculations based on ECV 1992 and CNPV 1992

**Figure 1.3 FGT0 Per capita income 2002 at district level**

Source: Author's calculations based on EPH 2002 and CNPV 2002

### **Figure 1.4 Gini Per capita income 2002 at district level**

Source: Author's calculations based on EPH 2002 and CNPV 2002

### **Figure 1.5 FGT0 Per capita income Itapua department 1992 and 2002 at district level**

Source: Author's calculations based on ECV 1992, CNPV 1992, EPH 2002 and CNPV 2002.

The maps in Figure 1.5 show the heterogeneity of poverty levels in small areas, using the Itapua department as an example. Itapua is one of the most prosperous departments in Paraguay in terms of its economic performance between 1992 and 2002 (substantial GDP growth driven by the mechanized soya agro-industry). Regarding departmental mean poverty levels in 2002, Itapua ranked 11 out of 18 departments, about 1 percentage point above the national poverty rate, despite its high GDP growth. What lies behind this apparent contradiction can be seen by mapping Itapua's poverty levels by district. In 1992, there was a belt of poor to extremely poor districts in the north of the department (darker colors), with poverty rates of up to 68%, versus a more prosperous zone in the south (brighter colors) with poverty rates as low as 14%. By 2002, as a consequence of GDP growth, poverty had generally decreased in Itapua (more districts with bright colors), but the belt of poor districts in the north remained. Some of these poor districts even saw their poverty levels increase, to as much as 72% of the population. The lowest levels in the south were 29% in 2002. So, considering a pro-poor policy intervention, for instance conditional cash-transfer programs for extremely poor households, Itapua might not be selected for program participation on the basis of departmental mean poverty levels. Nevertheless, once poverty estimates are disaggregated by district, it turns out that some of the poorest districts of the country are located in Itapua, directly neighboring some of the most prosperous districts in the country. Small area welfare estimates thus help to improve the targeting of pro-poor policies.

To show an example of the precision that can be achieved at the district level, Figure 1.6 shows the predicted poverty headcount in rural Itapua for 2002, along with a confidence interval from one standard error below to one standard error above the point estimate. The department of Itapua was selected because it covers almost the complete range of standard errors for point estimates observed in the 1992 and 2002 exercises, varying from 0.015 to 0.075.

**Figure 1.6 Rural poverty estimates Itapua 2002** 

Source: Author's calculations based on EPH 2002 and CNPV 2002
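The interval plotted in Figure 1.6 can be reproduced with very simple arithmetic. The sketch below is illustrative only: the function name and the headcount of 0.40 are hypothetical, while the standard errors 0.015 and 0.075 are the extremes of the range quoted in the text. Truncating the band to the unit interval is an assumption of this sketch, since a headcount is a share.

```python
def one_se_band(estimate, se):
    """Interval from one standard error below to one above a headcount."""
    lower = max(0.0, estimate - se)  # a share cannot fall below 0
    upper = min(1.0, estimate + se)  # nor exceed 1
    return lower, upper

# Hypothetical district headcount with the smallest and largest
# standard errors reported for the 1992 and 2002 exercises.
narrow = one_se_band(0.40, 0.015)
wide = one_se_band(0.40, 0.075)
```

With the largest standard error the band is five times as wide as with the smallest, which is why district rankings based on point estimates alone should be read with caution.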

Apart from the standard errors of the point estimates, the regression models have structured and unstructured errors, as seen above. To check for spatial patterns in these errors, they are mapped in the annex, together with the relative changes in poverty and inequality.

As mentioned above, there is a set of earlier poverty map exercises for Paraguay. Robles (2000) combined the 1992 census with the 1997/98 survey using the Hentschel et al. (2000) method, which differs slightly from the method applied in this paper, so the results cannot easily be compared. The second poverty map exercise, carried out by Otter (2003) using the same method applied in this paper, combines the 2002 survey with a 10% sub-sample of the census and combines the estimated poverty levels with weighted unmet basic needs percentages. Since the databases are not the same, there may be some difficulties in comparing the results of this paper with the Otter (2003) exercise. Finally, Robles and Santander's (2004) poverty map exercise is the most similar to this paper. Using the same method, they combine the 2002 census with 2003 household survey data, mostly because the 2003 survey sample makes it possible to run a separate regression model for every department (18 models) rather than only 4 models by region as in 2002. Since poverty rates changed considerably between 2002 and 2003 (dropping by 6 percentage points in the national mean), the best way to compare the results of these two exercises is to compare rankings of districts by poverty level, which should not change strongly even if poverty percentages decrease considerably. When comparing the rankings, we observe that 64% of all districts are ranked within the same deciles, while the standard deviation of the ranking differences is only 0.94 points. Consequently, the results of the two poverty mapping exercises are consistent with each other, and the differences should be a consequence of the more detailed estimates by Robles and Santander and of poverty changes between these two years.
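The ranking comparison described above can be sketched as follows. The function names and the toy poverty rates are hypothetical, and measuring ranking differences in decile points is an assumption of this sketch; the text does not state the exact unit behind the 0.94 figure.

```python
import numpy as np

def decile_of_rank(values):
    """Assign each district to a decile (1-10) of its rank by value."""
    values = np.asarray(values, dtype=float)
    ranks = values.argsort().argsort()  # 0 = lowest poverty rate
    return ranks * 10 // len(values) + 1

def compare_rankings(poverty_a, poverty_b):
    """Share of districts in the same decile and SD of decile differences."""
    diff = decile_of_rank(poverty_a) - decile_of_rank(poverty_b)
    share_same = float(np.mean(diff == 0))
    sd_diff = float(np.std(diff))
    return share_same, sd_diff

# Hypothetical district poverty rates from two mapping exercises.
exercise_a = np.linspace(0.10, 0.70, 20)
exercise_b = exercise_a - 0.06  # uniform drop of 6 percentage points

share_same, sd_diff = compare_rankings(exercise_a, exercise_b)
```

A uniform drop in poverty leaves every district in its decile, which is the reason rankings rather than levels are compared when national poverty has shifted between the two years.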

### **1.3.4 Pro-poor growth evidence**

Although this paper is not about pro-poor growth, its results provide empirical evidence on such growth from poverty map exercises. Even if the existence of pro-poor growth could easily be confirmed from the household survey data, doing so with simulated incomes based on census data makes it possible to identify whether there are any spatial patterns in pro-poor growth, for example whether a large number of households benefiting from pro-poor growth are concentrated in a small and limited geographic area.

According to international organizations, pro-poor growth is simply defined as economic growth that benefits the poor (e.g., UN 2000a; OECD 2001). This definition, however, provides little information on how to measure or how to implement it. What remains to be specified is, first, whether economic growth benefits the poor and, second, if this is the case, to what extent. Klasen (2004) provides more explicit requirements that a definition of pro-poor growth needs to satisfy. The first requirement is that the measure differentiates between growth that benefits the poor and other forms of economic growth, and that it answers the question of how much the poor have benefited. The second requirement is that the poor must have benefited disproportionately more than the non-poor. The third requirement is that the assessment must be sensitive to the distribution of incomes among the poor. The fourth requirement is that the measure must allow an overall judgement of economic growth and not only focus on the gains of the poor.

To identify the existence of pro-poor growth in per capita income according to the point estimates from the poverty map exercise, specific inflation rates by region and income decile were calculated. To make these measures as realistic as possible, they are based on consumption profiles by decile, constructed as the mean of the 1997/98 and 2000/01 consumption profiles<sup>12</sup> (only in these two periods did Paraguayan household surveys include a consumption module). Table 1.14 shows the decile-specific inflation rates. In general, inflation is lower for lower income deciles and lower for less urban or more rural areas. These results seem to drive the specific results on pro-poor growth.


**Table 1.14 Mean inflation rates by deciles, Paraguay 1997/98 - 2000/01 (%)**

Source: Author's calculations based on EIH 1997/98 and EIH 2000/01.
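One way to picture a decile-specific inflation rate is as a consumption-share-weighted average of price changes by group of goods and services. The sketch below follows that idea under stated assumptions: the exact index formula used for Table 1.14 is not given in the text, and all shares and price changes here are invented for illustration. Only the seven groups and the averaging of two consumption profiles are taken from the text.

```python
import numpy as np

GROUPS = ["food", "clothing", "housing", "health",
          "transport", "education", "various"]

def decile_inflation(shares_period_a, shares_period_b, price_change):
    """Inflation for one decile: mean budget shares times price changes."""
    shares = (np.asarray(shares_period_a, dtype=float)
              + np.asarray(shares_period_b, dtype=float)) / 2.0
    shares = shares / shares.sum()  # guard against rounding in the shares
    return float(shares @ np.asarray(price_change, dtype=float))

# Hypothetical price changes by group, in the order of GROUPS.
price_change = [0.05, 0.08, 0.10, 0.12, 0.15, 0.10, 0.09]

# Hypothetical budget shares (identical in both periods for simplicity).
poor_shares = [0.60, 0.05, 0.15, 0.05, 0.05, 0.03, 0.07]
rich_shares = [0.30, 0.08, 0.20, 0.08, 0.15, 0.10, 0.09]

poor_inflation = decile_inflation(poor_shares, poor_shares, price_change)
rich_inflation = decile_inflation(rich_shares, rich_shares, price_change)
```

In this invented example the poorer decile faces lower inflation simply because its budget is concentrated in the group with the smallest price increase, which mirrors the qualitative pattern reported in the text.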

Figure 1.7 shows the growth incidence curve of log per capita income in constant 1992 currency values. There is a clear pro-poor growth pattern for percentiles 5 to 25. Growth incidence curves produced separately for the four regions show that there is almost no pro-poor growth in the Asuncion and Central Urban areas and very little in Remaining Urban, and that pro-poor growth occurs mostly in rural areas.

<sup>12</sup> Carried out by groups of goods and services: food, clothing, housing, health, transport, education, various.

**Figure 1.7 Growth incidence curve of log per capita income**

Source: Author's calculations based on ECV 1992, CNPV 1992, EPH 2002 and CNPV 2002.
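A growth incidence curve reports income growth at each percentile of the distribution rather than at the mean. The following sketch computes it as the difference in log income between two periods at matching percentiles. The function name, the percentile grid and the toy incomes are hypothetical, and incomes are assumed to be already deflated to a common base year.

```python
import numpy as np

def growth_incidence_curve(income_t0, income_t1, percentiles=range(5, 100, 5)):
    """Log-income growth at each percentile between two periods."""
    p = np.array(list(percentiles), dtype=float)
    q0 = np.percentile(np.asarray(income_t0, dtype=float), p)
    q1 = np.percentile(np.asarray(income_t1, dtype=float), p)
    return p, np.log(q1) - np.log(q0)

# Hypothetical real per capita incomes in the base year.
base = np.linspace(50.0, 500.0, 100)

# Case 1: everyone gains 10 percent, so the curve is flat.
_, flat_curve = growth_incidence_curve(base, base * 1.10)

# Case 2: everyone gains the same absolute amount, so proportional
# growth is highest at the bottom and the curve slopes downward.
_, pro_poor_curve = growth_incidence_curve(base, base + 10.0)
```

A downward-sloping curve over the lower percentiles is the pattern that is read as pro-poor in the relative sense, since incomes at the bottom grow faster than incomes further up.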

### **1.4 Discussion**

In the regression results, the adjusted R-squared values are not very high, at approximately 0.6. On the one hand, this is because only variables whose coefficients are highly significant were included, in order to make the models as robust as possible, as explained above. On the other hand, the reduced levels of adjusted R-squared may result from the considerable number of dummy variables in the models, which probably have lower variance than other kinds of variables and may therefore have reduced the explanatory power.

In regression results, it seems that the less homogeneous a population is, the higher the probability of identifying locational components in the error term. If there are locational components in the error term, it is easier to identify them in smaller geographic areas (this is why the Asuncion model produces more significant cluster effects than the other regions in 1992).

If this is the case, the question is why the 2002 models produce far fewer significant cluster effects. Could it be that the population became more homogeneous? Even if the net changes in poverty and inequality between 1992 and 2002 are not large, the household survey data show that poverty fell between 1992 and 1997 and rose between 1998 and 2002. Urban inequality tended to decrease while rural inequality tended to increase over the whole period. Additionally, there is growing migration of the poor to urban areas and growing urban poverty.

There are two main differences between the 1992 and the 2002 results. First, the 1992 results include many more significant cluster effects than those for 2002, which capture part of a locational effect. In 2002, this effect seems to have disappeared. Notably, the locational effect is not only or directly related to geographic location, but also to the population group living in the observed area at a given moment and to its characteristics. Second, even though the 2002 regression models include fewer variables and fewer significant cluster effects, their predictive power for poverty is higher than in 1992, while their predictive power for inequality is almost the same as in 1992. Consequently, geographic location seems to be less important for 2002 than for 1992.

Although the income regression models do not model determinants of income but variables that are correlated with it, most of these variables have the expected signs. This is also true for the cluster means. The interpretation of these cluster means, which try to capture part of the locational error, is difficult. Nevertheless, for some cluster mean variables an intuitive understanding is possible. For example, in some models the percentage of primary sector employment appears as a significant variable with a negative sign at the household (individual) level, whereas its cluster mean has a positive sign. This may be understood as a positive effect at the community level, *ceteris paribus* and for the given mean level of income in the region.

At least as far as the empirical evidence from Paraguay is concerned, the methodology seems to work better for the prediction of higher incomes, since extreme poverty is overestimated (because lower incomes are underestimated), as are mean incomes. Consequently, the associated distributions are not the same as those observed in the household surveys. Several reasons may account for this. For example, rural incomes (where most of the low incomes are located) depend strongly on climate and market price changes that are not captured by the variables included in a census. The lowest incomes in urban areas may be difficult to simulate correctly due to sampling and measurement problems in household surveys.

The poverty maps show that poverty is concentrated in the center of eastern Paraguay (where 98% of the population lives). Changes in poverty during the observation period significantly altered neither the spatial distribution of poverty nor that of inequality. In general, structured and unstructured errors are higher in more rural areas. All these results are consistent with Paraguayan rural economic history over the period, with a crisis of small-scale cotton cash crop farming and an increase in large-scale mechanized soybean farming, which deepened poverty and inequality in rural areas.

### **1.5 Conclusions**

As shown, the method of Elbers et al. is a reliable method for small area welfare estimates, producing poverty point estimates at sub-national levels. Obviously, there are several sources of error in the methodology, as well as errors from sampling and measurement problems in the household surveys. Nevertheless, the income estimates are consistent with the economic history of Paraguay and, since most of the errors made during the estimation procedure can be quantified, it is possible to determine their reliability. In any case, the gain in additional information is crucial for policy design and implementation.

Poverty analysis is often based on national level indicators that are compared over time or across countries. The broad trends that can be identified using aggregate information are useful for evaluating and monitoring the overall performance of a country. For many policy and research applications, however, the information that can be extracted from aggregate indicators is not sufficient, since these hide significant local variation in living conditions within countries.

The detailed poverty maps of small administrative areas, which are the ultimate output of the simulated-welfare mapping method, provide benefits that help address the shortcomings of aggregate poverty analysis in the following ways:

### *(i) Poverty maps capture the heterogeneity of poverty within a country.*

Almost all countries in the world have regions that are better off and others that are left behind. Such differences are often lost in national level statistics. Poverty maps can reveal the variation in local poverty levels when small area information is available. As shown, seemingly homogeneous regions can actually have a large degree of local heterogeneity.

### *(ii) Poverty maps improve targeting interventions.*

In designing poverty reduction programs, resources can be used more effectively if the neediest groups can be better targeted. This reduces the leakage of benefits from a poverty reduction program to non-poor households and also reduces the risk that poor households will be missed by a program. This requires adequate targeting of poor areas, but also correct beneficiary selection.

### *(iii) Poverty maps can help governments - national and local - to articulate their policy objectives.*

Basing allocation decisions on observed geographic poverty data, rather than subjective rankings of regions, increases the transparency of government decision making. Such data can thus help limit the influence of special interests in allocation decisions. There is a related role for well-defined poverty maps to lend credibility to government and donor decision-making. By increasing transparency, poverty maps can help prevent the regional autonomy policy from being hijacked by the local elite.

### *(iv) Poverty maps have an important role in communicating information on welfare distribution to the civic population within a country.*

Poverty maps are not only useful to governments and decision makers, but also to local communities. Compiling disaggregated information on human welfare generates locally relevant information. This provides local stakeholders with the facts that are required for local decision making and for negotiation with government agencies. Poverty maps thus become an important tool for local empowerment and decentralization.

### *(v) Poverty maps are useful for evaluating the impact of various programs.*

Poverty maps offer opportunities to undertake detailed empirical research on the causal relationships between local poverty, income inequality, and various other social outcomes, both at the individual and community levels. Until now, scarcity of welfare indicators for small areas has prevented researchers from studying the relationship between various programs, poverty, inequality, and various outcomes, such as health, education, crime, and the environment. Poverty maps open up more opportunities for researchers to examine these relationships.

### *(vi) Estimation of small area indicators of poverty allows their incorporation into geographical information systems (GIS).*

This feature of poverty maps facilitates the combination of poverty information with other indicators from policy-relevant subject areas. Examples are geographic databases of transport infrastructure, public service centers, access to input and output markets, or information on natural resources quality and vulnerability. Using geographic overlay techniques and spatial analysis methods, the newly constructed databases on poverty can thus be used to address a range of multidisciplinary questions. The databases can also be used by the private sector to guide them in determining the locations for new investment opportunities.

# **Chapter 2**

# **Does Inequality Harm Income Mobility and Growth?**

### **An Assessment of the Growth Impact of Income and Education Inequality in Paraguay 1992-2002**

### **2.1 Introduction**

Latin America is the most unequal region of the world in terms of income or expenditure, as well as regarding other aspects of economic or social exclusion. The region suffered the lost decade of the nineteen eighties and experienced a modest recovery in the nineteen nineties. In the nineteen nineties, most governments implemented stabilization policies, more or less close to the proposals of the Washington Consensus. Paraguay itself, however, suffered neither a debt crisis nor major economic instability during the eighties, so stabilization policies would not have been necessary or useful for the Paraguayan economy and its business cycles in the nineties. Nevertheless, many of the macroeconomic policies applied in Paraguay during the nineties were close to the Washington Consensus. The most striking macroeconomic result of the decade was a decrease in per capita income beginning in late 1995, hand in hand with an increase in poverty after 1996. Given the persistently high levels of poverty incidence in Paraguay to date, understanding the determinants of growth at the household level in the Paraguayan economy remains an important but under-researched field in economics. This appears to be particularly true for the question of whether inequality has a positive or negative effect on economic growth, a question that is both fundamental in (development) economics and highly relevant for poverty reduction policies. Although the effect of inequality on growth has important implications for poverty (Bourguignon, 2004; Ravallion, 1997), empirical evidence on this link is virtually nonexistent for Paraguay.<sup>13</sup>

The effect of inequality on economic growth is the subject of a large literature. Aghion et al. (1999) and Thorbecke and Charumilind (2002) review this literature and show that theory does not provide firm predictions of the sign of the effect.<sup>14</sup> Empirical studies in the 1990s have been "... impressively unambiguous ..." (Aghion et al., 1999, p. 1617) in concluding that the growth effect of inequality is negative, but more recently some authors have obtained contrasting results (e.g. Forbes, 2000). The most common denominator in these studies is the nature of the data used: the empirical inequality-growth literature is largely based on cross-country data.

<sup>13</sup> Various country analyses of aspects of financial liberalization and openness were conducted by a research team supported by CEPAL, UNDP and IADB. These studies include Paraguay and focus on CGE simulation models and their counterfactual effects on households, but they do not consider the effect of inequality on growth (Ganuza, Morley and Taylor 1998; Ganuza, Paes de Barros, Taylor and Vos 2001; Ganuza, Morley, Robinson and Vos 2004).

This paper contributes to the existing inequality-growth literature by providing empirical evidence that is new in a number of ways. First, the present study is based on micro data for Paraguay. This avoids the data comparability problems that affect cross-country studies (see Section 2). While there are a small number of inequality-growth studies using micro data (for example Joeng, 2001; Schipper and Hoogeveen, 2005), this is the first such study for Paraguay. Second, the data used consist partly of so-called small area welfare estimates, which are obtained by combining information from a census and a survey. For this paper the small area welfare estimates were grouped in a pseudo panel. Third, theoretical and empirical studies have been criticized for their focus on income or expenditure inequality as the determinant of growth. Birdsall and Londono (1997) show that once land and human capital inequality are entered in a cross-country growth regression, income inequality no longer has a significant effect on growth. Elbers and Gunning (2004) address this issue theoretically using a Ramsey-type household growth model and show that growth is affected by 'underlying' inequalities in assets, abilities and shocks. In particular, these authors show that higher 'ability' (human capital) inequality will positively affect growth if the production function is convex in ability. In that case, a mean-preserving spread in human capital results in a higher mean steady-state level of output, and therefore in higher growth. In this paper, this issue is explicitly addressed by estimating the growth effect of inequality in human capital. The results indicate that it is income inequality rather than human capital inequality that affects growth and that this effect is negative. Nevertheless, there are also positive growth effects of human capital inequality, although some are weaker than the income inequality effects.
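The convexity argument attributed to Elbers and Gunning (2004) can be stated compactly using Jensen's inequality. The notation below is introduced here for illustration and is not taken from the source: $a$ denotes ability (human capital), $f$ a production function that is convex in $a$, and $a' = a + \epsilon$ a mean-preserving spread of $a$, with $E[\epsilon \mid a] = 0$.

$$
E\big[f(a')\big] \;=\; E\Big[\,E\big[f(a+\epsilon)\mid a\big]\Big] \;\ge\; E\Big[f\big(E[a+\epsilon \mid a]\big)\Big] \;=\; E\big[f(a)\big]
$$

The inequality in the middle is Jensen's inequality applied conditionally on $a$. Under these assumptions, mean output is at least as high under the more unequal distribution of ability, which is the sense in which ability inequality can raise the mean steady-state level of output. If $f$ were concave in $a$, the inequality would be reversed.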

<sup>14</sup> Positive inequality-growth effects can be attributed to a positive effect on savings, to the existence of investment indivisibilities or to positive incentive effects of inequality. A negative inequality-growth effect can be explained by political tension, instability and demands for redistribution due to inequality, by reduced investment opportunities for the poor, worsened borrowers' incentives and by higher macro-economic volatility. A 'unified' model that aims to reconcile these conflicting effects is presented in Galor (2000); this paper predicts that the effect of inequality on growth is non-linear, with a positive effect at an 'early stage of economic development' and a negative effect at a 'later stage'.

The paper is organized as follows. Section 2 discusses the existing empirical inequality-growth literature and some of the income mobility literature. Small area welfare estimates are briefly described as an alternative source of data for this type of investigation. In Section 3 the growth model and descriptive statistics are presented. Section 4 discusses the econometric issues that need to be addressed given the model in use and, in particular, given that some of the variables have been imputed using small area welfare estimation. Results are presented in Section 5, and Section 6 concludes.

### **2.2 Data: macro, micro and small area welfare estimates**

Cross-country inequality-growth studies, while providing considerable empirical evidence, have been criticized for various reasons. A general problem with both macro (cross-country) and micro growth studies is the 'open-endedness' of the underlying theory: many variables potentially affect growth, and theory may often not give clear guidance as to which specification is preferable. Data used in cross-country studies are national aggregates that are likely to lose valuable region- or gender-specific information (Deininger and Okidi, 2003). Brock and Durlauf (2001) reject causal interpretations in cross-country studies except under exceptional conditions. Their main argument is that causal interpretation requires that estimated parameters can be assumed constant, which is not plausible given the importance of country-specific unobserved information (e.g. regarding policy).

Comparability of variables that are intended to measure the same concepts across countries is a further issue in cross-country studies. This is particularly problematic for cross-country inequality data (Atkinson and Brandolini, 2001). An issue that has not received much attention in the literature is that, even when variables are defined and measured in exactly the same way, national statistics (including GDP) are often estimates derived from, for instance, national household surveys - as is the case for inequality estimates. Even if these estimates are representative at the national level, they are still point estimates with a standard error, a fact that the analyst has to take into account when doing regressions: one should expect that properly accounting for the uncertainty with respect to these estimates, reflected by their standard error, translates into higher standard errors in the growth regression coefficient estimates. This problem is equivalent to the one encountered when using small area welfare estimates in regression analysis, as is discussed in detail in Section 4.

A problem with household data is that only surveys for very large countries provide sufficient data points to meaningfully include inequality indicators in a regression, while census data typically do not provide the income or wealth variables and covariates needed in a growth regression. As a result, only a small number of inequality-growth studies that use micro or regional data exist. Ravallion (1998) estimates a linear household level growth model with local externalities and finds a significant negative effect of inequality for rural China. Balisacan and Fuwa (2003) find a positive effect of inequality on provincial level growth for the Philippines, using a linear model. Schipper and Hoogeveen (2005), using downstream regressions for Uganda, find that it is human capital inequality rather than income inequality that affects growth and that the effect is positive.

An important advantage of regional or household data is that comparability problems are much less severe than in cross-country datasets: the definitions of variables or phrasing of survey questions are generally uniform across regions for a given dataset. Depending on the level of disaggregation, regional analyses may also be able to use larger numbers of observations than cross-country analyses; household growth studies are especially advantaged in this sense.

Until recently, the unavailability of nationwide inequality data covering a longer period precluded the study of the inequality-growth relation for Paraguay.<sup>15</sup> However, the application of welfare estimation techniques for small area target populations has recently provided income estimates for all households in Paraguay for 1992 and 2002 (see Chapter 1). This now allows the study of the inequality-growth relation for Paraguay.

### **2.3 Small area welfare estimation**

Part of the data used for this paper is derived using small area welfare estimation techniques first described in Hentschel et al. (2000) and refined in Elbers et al. (2003); we refer to the latter paper for details of the technique and provide a brief review below.

Small area welfare estimation combines data from a census and a household survey in a three-stage process. First, a set of variables that are common to the survey and the census is identified. Second, household per capita expenditure is regressed on these common variables using the household survey data and census means obtained for the clusters from which the survey households originate; this yields coefficient estimates with the associated variance-covariance matrix and estimates of the distribution of household and cluster error terms. Third, out-of-sample prediction on unit record census data is used. Predicted values are calculated typically 100 times, each time drawing variable coefficients and household-specific and cluster-level error terms from the relevant distributions. This yields, for each household in the census, predicted per capita income and its standard error. A close correspondence between census and survey household characteristics is needed to obtain reliable welfare estimates.<sup>16</sup> For this reason, small area welfare estimates have typically only been generated for the years close to a census year. Hoogeveen et al. (2003) show how, in the presence of panel survey data for which one of the waves was collected at the time of the census, the welfare estimates can be updated by associating household characteristics collected during the census year with expenditures obtained for a more recent period. Since panel surveys do not exist for Paraguay, and since the analysis of the present paper is based on two different censuses, the inequality and growth analysis is based on a pseudo panel built up from income estimates for each household in each census.

<sup>15</sup> The first nationwide inequality estimates in Paraguay are based on the household survey of 1992 (carried out by IADB and the National University). Only from 1998 onwards does the National Statistics Bureau (DGEEC) provide annual updates of nationwide household surveys.

More formally, small area welfare estimates can be obtained using the following model:

$$\ln y_{cs,t+1} = E\left[\ln y_{cs,t+1} \mid X_{cs,t}\right] + \eta_{c,t+1} + \varepsilon_{cs,t+1} \tag{2.1}$$

where time is represented with subscript *t*, survey households with subscript *s*, census households with subscript *h*, and the cluster from which census and survey households originate with subscript *c*.

Predicted log per capita expenditure is now derived, for each household in the census, from:

$$\ln \tilde{y}_{ch,t+1} = X_{ch,t}^{T}\tilde{\beta} + \tilde{\eta}_{c,t+1} + \tilde{\varepsilon}_{ch,t+1} \tag{2.2}$$

and welfare estimates are based on:

$$\tilde{\mu}_{t+1} = E\left[W_{t+1} \mid m_{t}, \tilde{\nu}_{t,t+1}\right] \tag{2.3}$$

<sup>16</sup> Much attention is therefore devoted to identifying common variables by assuring that variable definitions are identical between the census and the survey, that questions are phrased the same way, that coding and enumerator instructions are identical and that the survey and census are fielded contemporaneously. When the latter condition is not met -and this is more of a problem in rapidly changing economic environments-, changes in the economic situation will be reflected in household characteristics. As a result, survey variables identified as common to the census, are actually not representative of the census and small area welfare estimates can not be derived.

Once predictions are made using (2.2), welfare estimates can be generated for any administrative unit, but their precision decreases with the degree of disaggregation. For Paraguay, accurate welfare estimates coming from household surveys are available for three levels: nationwide, by urban or rural area, and by region.<sup>17</sup>
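The three-stage procedure can be sketched on synthetic data as follows. This is a minimal illustration, not the estimation used for Paraguay: the single regressor, all parameter values, the crude variance-component estimates and the choice of mean income as the welfare indicator are assumptions made for the example, and the function names are hypothetical.

```python
import math
import random
import statistics

random.seed(0)

def make_households(n_clusters, per_cluster):
    """Synthetic (cluster, x, ln_income) records; all numbers are illustrative."""
    rows = []
    for c in range(n_clusters):
        eta = random.gauss(0, 0.2)                 # cluster-level error
        for _ in range(per_cluster):
            x = random.uniform(0, 12)              # stage 1: a variable common to both sources
            rows.append((c, x, 5.0 + 0.1 * x + eta + random.gauss(0, 0.4)))
    return rows

survey = make_households(40, 15)                           # income observed
census = [(c, x) for c, x, _ in make_households(40, 100)]  # income not observed

# Stage 2: regress log income on the (centred) common variable in the survey.
xs = [x for _, x, _ in survey]
ys = [ln_y for _, _, ln_y in survey]
x_bar, a_hat = statistics.fmean(xs), statistics.fmean(ys)
sxx = sum((x - x_bar) ** 2 for x in xs)
b_hat = sum((x - x_bar) * (y - a_hat) for x, y in zip(xs, ys)) / sxx
resid = [y - a_hat - b_hat * (x - x_bar) for x, y in zip(xs, ys)]

# Split residuals into a cluster part and a household part (crude estimates).
by_cluster = {}
for (c, _, _), r in zip(survey, resid):
    by_cluster.setdefault(c, []).append(r)
cluster_mean = {c: statistics.fmean(v) for c, v in by_cluster.items()}
sd_eta = statistics.pstdev(list(cluster_mean.values()))
sd_eps = statistics.pstdev([r - cluster_mean[c] for (c, _, _), r in zip(survey, resid)])
s2 = sum(r * r for r in resid) / (len(resid) - 2)
se_a, se_b = math.sqrt(s2 / len(resid)), math.sqrt(s2 / sxx)

# Stage 3: out-of-sample prediction on the census, repeated 100 times.
def simulate_once():
    # With a centred regressor the intercept and slope estimates are
    # uncorrelated, so they can be drawn independently.
    a, b = random.gauss(a_hat, se_a), random.gauss(b_hat, se_b)
    etas, incomes = {}, []
    for c, x in census:
        if c not in etas:
            etas[c] = random.gauss(0, sd_eta)
        incomes.append(math.exp(a + b * (x - x_bar) + etas[c] + random.gauss(0, sd_eps)))
    return statistics.fmean(incomes)               # welfare indicator: mean income

draws = [simulate_once() for _ in range(100)]
estimate, std_error = statistics.fmean(draws), statistics.stdev(draws)
print(f"mean income: {estimate:.1f} (s.e. {std_error:.1f})")
```

The same loop could return a poverty rate or an inequality index instead of mean income; restricting the census records to a smaller area before averaging would show the loss of precision under disaggregation.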

Our analysis makes use of two data sets. Unit record data from Paraguay's 1992 population census are combined with the 1992 household survey to produce small area welfare estimates for all households. The same exercise is carried out with the 2002 population census and the 2002 household survey. The 1992 census was carried out in August 1992 and covers 526,050 urban households and 454,342 rural households. The 1992 household survey was carried out between October and December 1992. The 2002 census was carried out in August 2002 and covers 782,966 urban households and 505,567 rural households. The 2002 household survey was carried out during November and December 2002. Both censuses comprise, for all household members, information on household composition, ethnic background, marital status and educational attainment. Growth and inequality variables are calculated using the income values prepared for Chapter 1. That chapter shows that the income estimates for 1992 and 2002 are unbiased and closely correlated estimates of the 'true' welfare estimates derived from the national household surveys. Estimates of income and inequality were derived for all 224 districts<sup>18</sup> of Paraguay for both years. Based on comparable household income data, they represent the first data set for Paraguay with comparable inequality estimates for two points in time for a substantial number of observations.


**Table 2.1 Welfare estimates, Paraguay, Selected Years**

Note: column entries are regional means of district estimates.

Source: Author's calculations based on results in Chapter 1

<sup>17</sup> That is, the ratios of mean values to standard errors are about the same as those obtained in household surveys.

<sup>18</sup> The Paraguayan "distrito" is a municipality, the smallest existing administrative unit.

A summary of the welfare estimates used in this paper is presented in Table 2.1. The Table confirms that on average poverty increased over the 1990s, except for the Remaining Urban region. Also, the increase in poverty was not distributed uniformly. Asuncion and Central Urban were the most affected regions. At the same time, inequality decreased where poverty increased and vice versa. Mean income increased in all regions except Asuncion. Even if this seems contradictory it is consistent with the macroeconomic history of Paraguay over the decade, with a growth period and poverty reduction until 1997. During this period, in general, income increased and inequality decreased. In the following period of recession (1998 to 2002) characterized by income decrease, not all of these mean income increases and inequality decreases were lost. Nevertheless poverty rose by means of the appearance of an important number of "new poor".

At first, it seems contradictory that income growth is observed together with a poverty increase and an inequality reduction. Poverty can increase despite income growth if prices grow faster than income (so that poverty lines rise faster) or if there are other problems with the poverty lines themselves. In Paraguay, poverty is defined by four different poverty lines for the Asuncion, Central Urban, Remaining Urban and Rural areas. Official poverty lines are updated yearly by an official inflation measurement that is limited to the Asuncion and Central Urban areas. To apply this inflation data to the other two areas, an implicit Engel coefficient, based on a measurement of consumption profiles not updated since 1998, is applied. This methodology seems to create some bias in the poverty lines. The inequality decrease associated with the poverty increase results from a general income loss after 1998, in which higher income groups suffered stronger losses than lower ones, resulting in decreasing inequality (more on this in Chapter 3).

Since pseudo panels are used for the analysis, the results could also be interpreted as an indicator for income-mobility, since the growth rates of estimated mean household per capita income between 1992 and 2002 at a district level are used as dependent variables. However, since education inequality is used as one of the independent variables, we also have notions of human capital in the analysis. This brings the results close to the link between growth, inequality and social mobility.

One of the primary motivations for economic mobility studies is to gauge the extent to which longer-term incomes are distributed more or less equally than are single-year incomes. Krugman, for instance, stated: "If income mobility were very high, the degree of inequality in any given year would be unimportant, because the distribution of lifetime income would be very even (...). An increase in income mobility tends to make the distribution of lifetime income more equal" (Krugman, 1992). Similar statements have been made by Shorrocks (1978), Atkinson, Bourguignon, and Morrisson (1992), Slemrod (1992), and Jarvis and Jenkins (1998).

Social mobility and income inequality together describe the "fairness" of an income distribution. If income is very unevenly distributed and social mobility is low, then there is a large gap between rich and poor and little chance of crossing that gap. However, since social mobility might be related to education, who has more mobility: better-educated or less-educated individuals? The answer may depend on the mobility concept used. In the *intergenerational* context, the recipient unit is the family, specifically a parent and a child. In the *intragenerational* context, the recipient unit is the individual or family at two different dates. The pseudo panel used in this paper refers to an intergenerational model, although the observation period is only ten years rather than a whole generation.

The literature distinguishes between six notions of mobility (Fields et al 2006, Scott and Lichtfield 1994). Briefly, they are: *time-dependence,* which measures the extent to which economic well-being in the past determines individuals' economic well-being at present; *positional movement,* which is what is measured when looking at individuals' changes in economic positions (ranks, centiles, deciles, or quintiles); *share movement,* which arises when individuals' shares of the total income change; *income flux,* which is what is gauged when looking at the size of the fluctuations in individuals' incomes but not their sign; *directional income movement,* which is what we measure when we determine how many people move up or down per amount of dollars; and *mobility as an equalizer of longer-term incomes,* which involves comparing the inequality of income at one point in time with the inequality of income over a longer period. If the results of this paper might be understood as an income mobility indicator, the study belongs in part to time dependence (because it considers initial levels of income and education inequality) and in part to positional movements (because it asks if there was some pro-poor growth).

Several papers show how the allocation of talent in an economy is important for the level of growth. Murphy, Shleifer, and Vishny (1991), for example, show that when talented people are attracted to the productive sector, they create high growth, but if they instead are attracted to rent seeking activities, they create stagnation. However, the use of talent needs the opportunity to be developed and exposed by a formal educational process.

Two papers have theoretically analyzed the relationship between social mobility and economic growth (Raut 1996; Hassler and Mora 1998). They both arrive at the conclusion that high social mobility is associated with higher economic growth, but the direction of causality and the transmission mechanisms between mobility and growth differ slightly between the models. Raut (1996) develops a signaling model of endogenous growth in which innate talents and education levels of workers drive the basic scientific knowledge accumulation in the economy. Hassler and Mora (1998) analyze an economy with two types of individuals: workers and entrepreneurs. Entrepreneurs are those who generate new ideas and new technologies and make the economy grow. The more intelligent the entrepreneurs, the higher the growth rate of the economy.

The implication of the above-mentioned studies is that to achieve optimum growth it is important that people get the opportunity to work in the sectors where they are most productive. This requires that young people's educational and occupational choices be determined by talent and not limited by family background. Linking these ideas to the model used in this paper, initial income level could be a proxy for family background and initial education level a proxy for institutional opportunities to develop talent (which is supposed to be distributed randomly, in spite of the fact that educational levels are usually strongly determined by family background).

### **2.4 The model**

For estimating yearly per capita income growth over the period 1992 - 2002 we build a pseudo panel at the district level, to be able to compare 1992 and 2002 results. The pseudo panel takes into account the age of the household head (3 year steps), his years of schooling (3 year steps), his mother tongue (as a proxy for ethnicity), the district of residence and the condition of migration (only non-migrant households are included).<sup>19</sup> Groups with common characteristics in 1992 and 2002 with more than 29 observations were considered for the model. Only non-migrant households entered the model. This is, on the one hand, because migration is not an important phenomenon over the whole period<sup>20</sup> and, on the other, because to analyse growth determinants we can then focus on the change of real conditions in each district, which are not biased by changes due to migration. Final estimates were carried out for five different models:<sup>21</sup> Asuncion (471 panels), Central Urban (655 panels), Remaining Urban (762 panels), Rural (2388 panels) and a pro-poor-growth sample (1300 panels), which includes all groups from any region living below the poverty line but having experienced positive income growth. The purpose of this pro-poor-growth panel is to identify whether there are any spatial patterns related to the geographic location of pro-poor growth. A separate panel for pro-poor growth additionally allows us to identify whether there are differences in household, family group or household head characteristics between poor households with and without income growth. Nevertheless, this last step of the analysis was not carried out in this paper.

<sup>19</sup> The idea of excluding migrant households is based on the fact that, even if a pseudo panel is used for this exercise, it is still possible to identify locational or district effects. A "pure" district effect would only be found when considering non-migrant households, even if there are also arguments for including migrant households, such as "pull and push" factors that make a certain district more or less attractive. Either way, migration levels in Paraguay over the nineties were not huge (only 8% of the population older than 15 years moved from one department to another between 1992 and 2002, and only 75% of these are non-poor) (Otter, 2007).
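The grouping rule can be sketched as follows. This is an illustration only: the record field names (`head_age`, `head_schooling`, `mother_tongue`, `district`, `migrant`, `ln_income`) and the function names are hypothetical, and cells are matched on identical characteristics in both years, as the text states; a cohort-tracking variant would instead shift the age band by ten years.

```python
from collections import defaultdict

def cell_key(hh):
    """Grouping key: 3-year age band, 3-year schooling band, mother tongue, district."""
    return (hh["head_age"] // 3, hh["head_schooling"] // 3,
            hh["mother_tongue"], hh["district"])

def cell_stats(households):
    """Mean log income and household count per cell, non-migrant households only."""
    cells = defaultdict(list)
    for hh in households:
        if not hh["migrant"]:
            cells[cell_key(hh)].append(hh["ln_income"])
    return {key: (sum(v) / len(v), len(v)) for key, v in cells.items()}

def pseudo_panel(census_1992, census_2002, min_obs=30, years=10):
    """Annual growth of mean log income for cells that have at least
    `min_obs` households (i.e. more than 29) in both census years."""
    s92, s02 = cell_stats(census_1992), cell_stats(census_2002)
    panel = {}
    for key in s92.keys() & s02.keys():
        (m92, n92), (m02, n02) = s92[key], s02[key]
        if n92 >= min_obs and n02 >= min_obs:
            panel[key] = (m02 - m92) / years
    return panel

def household(ln_income, migrant=False):
    """Helper that builds one toy record (all values are made up)."""
    return {"head_age": 31, "head_schooling": 6, "mother_tongue": "guarani",
            "district": "A", "migrant": migrant, "ln_income": ln_income}

# Toy usage with a lowered size threshold so that the tiny example yields a cell.
toy_1992 = [household(1.0), household(1.0), household(9.0, migrant=True)]
toy_2002 = [household(2.0), household(2.0)]
print(pseudo_panel(toy_1992, toy_2002, min_obs=2))
```

Keeping the size threshold as a parameter makes it easy to check how many cells survive under the 30-household rule compared with looser or stricter cut-offs.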

Estimating growth effects using a pseudo panel can be problematic. All households in a panel, even if they differ from each other, are assigned the same panel mean income change; this can cause problems of heteroskedasticity. Even if all households grouped together ought to be similar, some differences still remain. Not all sources of heteroskedasticity can or should be captured via a relationship with an independent variable. For example, using grouped data leads to heteroskedasticity if the groups are not all the same size. In this case the error variances depend on the group sizes (for group means, they are inversely proportional to group size). Using weighting factors could be a solution for this problem. In our case, households are the elements composing panel groups. Every household enters the panel with "size one", since characteristics of the household head are used as grouping criteria. Since this paper uses census data, no weighting factors are used. All size differences between groups reflect reality and should be taken as such since all households in the country are considered (only migrant households are left out of the analysis). For all five models to be run, panel groups contain between 30 and 1000 households. Nevertheless, in all cases, panels including between 30 and 250 households cover more than 90% of all observations. The distribution of these panel groups by size is almost the same. So if there is a heteroskedasticity problem caused by different panel sizes, it would be a systematic one.

<sup>20</sup> As a result, there are very few or no panels by district which fulfill the conditions of identical characteristics and more than 29 observations in the panel.

<sup>21</sup> Since income estimates in Otter (2006) were carried out for four different regions, each of these with its own poverty line, the growth analysis is based on the same regions as well as a growth analysis for poor households.

In the model we estimate yearly per capita income growth of each panel group over the period 1992 - 2002 as a function of the 1992 values of per capita income, income inequality, human capital inequality, male and female human capital, household demographics and employment sector. The model we estimate can be represented as:

$$\tilde{g}_{i,d} \equiv \left(\tilde{y}_{i,02} - \tilde{y}_{i,92}\right)/10 = \tilde{y}_{i,92}\beta_{1} + \tilde{I}^{exp}_{i,92}\beta_{2} + I^{edu}_{i,92}\beta_{3} + \mathbf{X}_{i,92}\gamma + \alpha_{d} + u_{i} \tag{2.4}$$

With the exception of the Gini coefficients, which are district averages, all other values are averages by panel *i*: *g* is the annual income growth rate between 1992 and 2002; *y* is the logarithm of per capita income; *I*<sup>exp</sup> is the Gini coefficient for per capita household income; *I*<sup>edu</sup> is the Gini coefficient for the number of years of formal education of the household head. **X** is a matrix of other covariates consisting of human capital (number of years of formal education entered separately for household heads and for spouses), head age, gender of the household head, logarithm of the number of individuals in each household, number of children and dummy variables for employment sectors, changes of some of these variables (which are likely to be endogenous) and some departmental dummies.<sup>22</sup> Given this approach, we are limited in our choice of covariates in **X** to what the census has to offer. District fixed effects, represented by *α<sub>d</sub>*, control for unobserved spatial heterogeneity; *u<sub>i</sub>* is an error term.

A non-standard econometric issue lies in the fact that some of the variables are not observed but imputed as described in Section 3. The imputed variables, income growth and income inequality, are denoted using tildes. See Table 2.2 for definitions and summary statistics.

An important issue in regional growth studies is the measurement of the dependent variable. In our case, the smallest available geographical subdivision in the database is the district, and within the district, households are grouped in panels. Growth for a panel *i* is usually specified as:

$$\mathbf{g}\mathbf{r}\_i = \frac{\mathbf{y}\_{i,t} - \mathbf{y}\_{i,0}}{t} \tag{2.5}$$

where *y* is a measure of panel income or expenditure. This measure is often specified as the logarithm of the mean of per capita income over households *h*  for group *i* (e.g. in Balisacan and Fuwa, 2003), i.e.:

$$y_{i} = \ln \left( \frac{\sum_{h=1}^{H} y_{h,i}}{H} \right) \tag{2.6}$$

<sup>22</sup> Potential changes in employment sectors could be considered proxies for structural changes in the productive sector.

However, as pointed out by Ravallion, 1998, the use of the logarithm of mean expenditure rather than the mean of log expenditure introduces a measure of the change in inequality in the error term of the regression equation. The argument is as follows: a general inequality measure is

$$I(y_{i}) = \ln M(y_{i}) - M(\ln y_{i}) \tag{2.7}$$

where *I(.)* is an inequality measure and *M(.)* denotes an average. Rearranging these terms we have:

$$\ln M(y_{i}) = M(\ln y_{i}) + I(y_{i}) \tag{2.8}$$

The LHS of (2.8) is the income of (2.6). However, if we think that the log of household income is the variable of interest we should use:

$$y_{i,t} = \frac{\sum_{h=1}^{N} \log(y_{h})}{N} \tag{2.9}$$

which is the first term in the RHS of (2.8). It is clear from (2.8) that we introduce a measure of inequality if we use the log of mean incomes as our regional income variable. Consequently, we introduce a measure of the change in expenditure inequality in our growth variable if we calculate mean expenditure using (2.6).

In an inequality-growth regression, this is likely to introduce a correlation between the error and the inequality variable, which will affect estimates through omitted variable bias. For example, consider the case where increases in inequality have a negative effect on growth, while the level of (initial) inequality has a positive correlation with the change in inequality. Then omitting the change in inequality will cause a spurious (negative) effect of inequality on growth (Ravallion, 1998). Since we have access to household level per capita income estimates aggregated by pseudo panels, it would be useful to compare the estimates of growth regressions using both types of dependent variable (mean-log(exp) and log-mean(exp)). This comparison has not yet been carried out. In this paper, only the mean log income is used in the regression models.
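The difference between the two growth measures can be checked numerically. The sketch below uses a made-up panel group of four households (the incomes are purely illustrative) and confirms that, by (2.8), growth in the log of mean income equals growth in the mean of log income plus the annualised change in the inequality measure of (2.7).

```python
import math

def log_of_mean(incomes):
    """ln M(y): logarithm of mean income, as in (2.6)."""
    return math.log(sum(incomes) / len(incomes))

def mean_of_log(incomes):
    """M(ln y): mean of log incomes, as in (2.9)."""
    return sum(math.log(y) for y in incomes) / len(incomes)

def inequality(incomes):
    """I(y) = ln M(y) - M(ln y), as in (2.7)."""
    return log_of_mean(incomes) - mean_of_log(incomes)

def annual_change(start, end, years=10):
    """(end - start) / years, the form of the growth rate in (2.5)."""
    return (end - start) / years

# Hypothetical group: three incomes unchanged, one quadrupled.
incomes_1992 = [100.0, 100.0, 100.0, 100.0]
incomes_2002 = [100.0, 100.0, 100.0, 400.0]

g_log_mean = annual_change(log_of_mean(incomes_1992), log_of_mean(incomes_2002))
g_mean_log = annual_change(mean_of_log(incomes_1992), mean_of_log(incomes_2002))
d_inequality = annual_change(inequality(incomes_1992), inequality(incomes_2002))

# Identity implied by (2.8): the log-mean growth rate contains the
# change in inequality on top of the mean-log growth rate.
assert abs(g_log_mean - (g_mean_log + d_inequality)) < 1e-12
print(g_log_mean, g_mean_log, d_inequality)
```

In this example inequality rises, so the log-mean growth rate exceeds the mean-log growth rate; with falling inequality the ordering would be reversed.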



**Table 2.2 Variables and descriptive statistics - Asuncion** 

Note: All observations are panel (sub-district) means of the household values of the variables mentioned, with the exception of the Inequality measures, which are district means. No. of observations: 471.

Source: Author's calculations based on results of income estimates in Chapter 1.


**Table 2.3 Variables and descriptive statistics - Central Urban**

Note: All observations are panel (sub-district) means of the household values of the variables mentioned, with the exception of the Inequality measures, which are district means. No. of observations: 655.

Source: Author's calculations based on results of income estimates in Chapter 1.


**Table 2.4 Variables and descriptive statistics - Remaining Urban**

Note: All observations are panel (sub-district) means of the household values of the variables mentioned, with the exception of the Inequality measures, which are district means. No. of observations: 762.

Source: Author's calculations based on results of income estimates in Chapter 1.


**Table 2.5 Variables and descriptive statistics - Rural**

Note: All observations are panel (sub-district) means of the household values of the variables mentioned, with the exception of the Inequality measures, which are district means. No. of observations: 2,388.

Source: Author's calculations based on results of income estimates in Chapter 1.


**Table 2.6 Variables and descriptive statistics - Pro-Poor-Growth-Panels**

Note: All observations are panel (sub-district) means of the household values of the variables mentioned, with the exception of the Inequality measures, which are district means. No. of observations: 1,300.

Source: Author's calculations based on results of income estimates in Chapter 1.

### **2.5 Estimation**

Before discussing the regression results, it is necessary to establish that they can be relied upon. There could be some important bias in the results, given that some of the variables were estimated and not observed. The properties of estimators obtained from downstream<sup>23</sup> regressions using imputed values for welfare indicators are investigated in Elbers et al. (2005). Their main proposition is that coefficients from regressions involving imputed welfare indicators derived from small area estimation techniques, either in the LHS or in the RHS, do not differ systematically from those of regressions with true indicators ('real data'). The intuition for this consistency result is that imputed variables can be regarded as a special kind of instrumental variables and may therefore be safely used in estimation. We briefly explore the issues involved in estimation for the general case with imputed values in both the LHS and the RHS of a regression equation.

<sup>23</sup> It is convenient to refer to our inequality-growth regression as a 'downstream' model so as to distinguish it from the 'upstream' expenditure model which has been used to generate the imputed values.

We consider a simple version of our downstream regression model (omitting inequality measures):

$$\mathbf{g}_{i} = \mathbf{y}_{i}\boldsymbol{\beta} + \mathbf{x}_{i}\boldsymbol{\gamma} + u_{i} \tag{2.10}$$

The dependent variable *g* and the independent variable *y* are obtained from upstream imputation; in what follows, imputed variables have tildes in order to distinguish them from 'true' values or observations. Writing imputed values as the difference between 'true' values and an error term, $\tilde{g} = g - \omega$ and $\tilde{y} = y - \xi$, we obtain:

$$\tilde{\mathbf{g}}\_{i} = \tilde{\mathbf{y}}\_{i}\boldsymbol{\beta} + \left(\xi\_{i}\boldsymbol{\beta} - \boldsymbol{\omega}\_{i}\right) + \mathbf{x}\_{i}\boldsymbol{\gamma} + \boldsymbol{u}\_{i} \tag{2.11}$$

The $\beta$ coefficient can be consistently estimated provided that (a) the imputed values $\tilde{g}$ and $\tilde{y}$ are consistent estimators of the conditional expectation of the true welfare measures and (b) the error terms $\xi$ and $\omega$ are uncorrelated with the regressors $\tilde{y}$ and $x$.

Elbers et al. (2005) show that when small area welfare estimates are used (a) is satisfied and (b) is likely to be satisfied. To see the latter, first note that $\tilde{y}$ is imputed Per Capita Income (PCI) or a non-linear measure calculated from PCI, e.g. inequality.<sup>24</sup> Both $\xi$ and $\omega$ are prediction errors and are thus orthogonal to the predicted values $\tilde{y}$ and $\tilde{g}$, respectively. Moreover, since $\tilde{y}$ and $\tilde{g}$ are based on the same prediction model, the prediction errors should be orthogonal with respect to both $\tilde{y}$ and $\tilde{g}$.<sup>25</sup>

The prediction errors should also be uncorrelated with regressors in x: since the upstream modeling process makes use of as many available instruments as possible, these regressors will have been considered as instruments in the upstream PCI prediction model, ruling out the presence of any remaining correlation.
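A small simulation can illustrate why a prediction error that is orthogonal to the imputed value does not bias the downstream coefficient, whereas classical measurement error does. The sketch is deliberately simplified and rests on assumptions not taken from the text: only the regressor is imputed, the upstream model is treated as known (no model error), there are no other covariates, and all parameter values are invented.

```python
import random

random.seed(1)
BETA = -0.5      # hypothetical true downstream coefficient
N = 20000        # number of simulated units

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with a constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Upstream: true income depends on an observed characteristic plus noise.
x = [random.gauss(0, 1) for _ in range(N)]
y_true = [2.0 + 1.5 * xi + random.gauss(0, 1) for xi in x]

# Downstream: growth depends on true income.
g = [BETA * yi + random.gauss(0, 0.5) for yi in y_true]

# Imputed income is the conditional expectation E[y | x]; its prediction
# error (y_true - y_imputed) is orthogonal to y_imputed by construction.
y_imputed = [2.0 + 1.5 * xi for xi in x]

# Classical measurement error for comparison: noise added to the true value.
y_noisy = [yi + random.gauss(0, 1) for yi in y_true]

b_true = ols_slope(y_true, g)
b_imputed = ols_slope(y_imputed, g)
b_noisy = ols_slope(y_noisy, g)
print(f"true: {b_true:.3f}  imputed: {b_imputed:.3f}  noisy: {b_noisy:.3f}")
```

Under these assumptions the slope on the imputed regressor stays close to the true coefficient, while the slope on the noisy regressor is attenuated towards zero. The standard errors of such a regression still require the correction discussed in the text.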

However, a correction of the estimated standard errors of the coefficients is necessary because the (upstream) imputation process creates correlation between the welfare estimates. Following Elbers et al. (2005), the prediction error of imputed variables, e.g. expenditure, can be decomposed as:

<sup>24</sup> Other variables could in principle be imputed or predicted as well; however, we consider PCI imputations.

<sup>25</sup> This holds *a fortiori* when either *y* or *z* is a non-linear transformation of PCI or its distribution, such as a poverty or inequality measure.


$$\xi = y - \tilde{y} = [y - E(y)] + [E(y) - \tilde{y}] \tag{2.12}$$

where *E(y)* is the conditional expectation of expenditure. The first term on the RHS of (2.12) is termed the idiosyncratic error, which is due to unobserved factors that determine expenditure, and the second term is the model error, which reflects uncertainty about the upstream model's parameters. Applying this error decomposition to both *g* and *y*, (2.11) can be written as

$$\begin{split} \tilde{\mathbf{g}}\_{i} &= [\tilde{\mathbf{y}}\_{i}\boldsymbol{\beta} + \mathbf{x}\_{i}\boldsymbol{\gamma}] + [(E(\mathbf{y}\_{i}) - \tilde{\mathbf{y}}\_{i})\boldsymbol{\beta} - (E(\mathbf{g}\_{i}) - \tilde{\mathbf{g}}\_{i})] \\ &+ [(\mathbf{y}\_{i} - E(\mathbf{y}\_{i})) \boldsymbol{\beta} - (\mathbf{g}\_{i} - E(\mathbf{g}\_{i})) + \boldsymbol{u}\_{i}] \end{split} \tag{2.13}$$

The RHS of the equation consists of three parts, each in square brackets. First we have a structural part consisting of imputed and non-imputed regressors and their respective coefficients. The second part represents the model error, the third part the sum of upstream idiosyncratic error and downstream error.

We simplify notation by rewriting these three parts as $\tilde{g}\_{i} = \mathbf{z}\_{i}\boldsymbol{\lambda} + \varphi\_{i} + \epsilon\_{i}$, where $\mathbf{z}\_{i} = (\tilde{y}\_{i}, \mathbf{x}\_{i})$ represents all regressors, both observed and imputed, and $\boldsymbol{\lambda} = (\beta, \gamma)$; $\varphi$ represents the 'model part' of the error and $\epsilon$ the idiosyncratic part. Assuming that the idiosyncratic part of the error is i.i.d., the variance matrix of the OLS coefficient estimates of (2.13) is:

$$\mathcal{V}\left(\lambda\right) = \sigma\_\epsilon^2 \left(\mathbf{Z}^\prime \mathbf{Z}\right)^{-1} + \left(\mathbf{Z}^\prime \mathbf{Z}\right)^{-1} \mathbf{Z}^\prime \mathcal{V}\left(\boldsymbol{\varphi}\right) \mathbf{Z}\left(\mathbf{Z}^\prime \mathbf{Z}\right)^{-1} \tag{2.14}$$

where the model part variance is:

$$V(\boldsymbol{\varphi}) = \beta^2 V(E(\boldsymbol{y}) - \tilde{\boldsymbol{y}}) + V(E(\mathbf{g}) - \tilde{\mathbf{g}}) - 2\beta\, Cov[(E(\boldsymbol{y}) - \tilde{\boldsymbol{y}}), (E(\mathbf{g}) - \tilde{\mathbf{g}})] \tag{2.15}$$

Equation (2.14) shows that, compared to OLS variance estimates, variance has to be adjusted upwards. As (2.15) shows, this adjustment depends on the variance in the model error. The more imputed variables are used, the more terms will have to be added: with *n* imputed variables, the number of terms in the RHS of (2.15) equals *n* variance terms plus *n(n-1)/2* covariance terms. For example, if one imputed variable is used in the RHS only, the adjustment is limited to the first term. In our regression model (equation (2.4)), two imputed variables are used in the RHS, one in the LHS.
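The adjustment in (2.14) and (2.15) can be illustrated with a minimal numerical sketch. It is not the author's implementation: the function names and arguments are hypothetical, and the variance matrices of the upstream model errors are assumed to be supplied by the imputation step.

```python
import numpy as np


def model_error_variance(beta, V_y, V_g, Cov_yg):
    """Model-part variance V(phi), following (2.15).

    V_y    : variance matrix of the model error E(y) - y_tilde
    V_g    : variance matrix of the model error E(g) - g_tilde
    Cov_yg : cross-covariance matrix between the two model errors
    (all three are hypothetical inputs taken from the upstream model)
    """
    # beta * (C + C') equals 2 * beta * C whenever C is symmetric,
    # which is the case written in (2.15).
    return beta**2 * V_y + V_g - beta * (Cov_yg + Cov_yg.T)


def adjusted_coefficient_variance(Z, sigma2_eps, V_phi):
    """Variance matrix of the OLS coefficients, following (2.14)."""
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    ols_part = sigma2_eps * ZtZ_inv                      # usual OLS term
    correction = ZtZ_inv @ Z.T @ V_phi @ Z @ ZtZ_inv     # model-error term
    return ols_part + correction
```

Because the correction term is positive semi-definite whenever `V_phi` is a valid variance matrix, the diagonal of the adjusted matrix cannot fall below the plain OLS variances.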

In sum, using imputed values of expenditure or other welfare (inequality) measures will lead to unbiased regression estimates. The coefficients of a model like equation (2.4) may be estimated using OLS under the assumption that the idiosyncratic prediction errors and the error term $u\_{i}$ are i.i.d.

Two additional econometric problems affect our growth model. First, Caselli et al. (1996) show that estimating a cross-section growth model using a fixed effects estimator will lead to substantial bias when the number of periods is small, especially on the coefficient for initial income (*y92*). The empirical growth literature suggests a number of solutions to this problem, most notably the Arellano-Bond estimator. Such estimators, however, need at least three periods to estimate the model, using the first period to instrument for the initial conditions of the second period, which explain growth between periods two and three. Since we only have two periods, we cannot follow this approach. However, although the bias on the 'convergence coefficient' may be significant, Monte Carlo experiments indicate that the bias in the other RHS coefficients tends to be small (Forbes, 2000).

The second problem is endogeneity. Even though our model does not contain 'flow' variables but only beginning-of-period 'stocks', initial expenditure *y91* has been used to construct the growth variable and is thus correlated with the error term. The same could be true for the changed variables on the RHS. Initial inequality may also be an endogenous variable, as the literature suggests that growth affects inequality (e.g. Aghion et al., 1999; Lundberg and Squire, 2003). One would expect this to be more problematic for changes in inequality rather than for initial inequality. Put to scrutiny, a Hausman test rejects exogeneity of expenditure inequality, but cannot reject exogeneity of education. Consequently, we deal with the endogeneity of initial expenditure and expenditure inequality.

Since we do not have lagged values, e.g. $y\_{t-1}$, to use as instruments, we have to find instruments amongst the (few) available sub-county census means. We have chosen the following instruments. In the Asuncion regression, the instrument for income is a variable that measures the 'education deficit' (the number of years of schooling missed) of children below the age of 13. The (initial) education deficit for children in this age group is strongly negatively correlated with initial income but, arguably, does not affect growth in the period analysed. The instrument for income inequality is 'ethnic fractionalization', which is the probability that any two citizens randomly chosen from the panel population are from different ethnic groups. For the Central Urban regression the instruments are the same as for Asuncion. For the Remaining Urban regression the instrument for income is the 'education deficit', as before, and the instrument for inequality is a dummy indicating whether the household head is working in the agricultural sector. For the rural model, once again the 'education deficit' is the instrument for income and the number of children is used as an instrument for inequality. The Pro-Poor-Growth Panel regression instruments are the same as for the Asuncion regression.
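The fractionalization measure described above can be sketched as follows. The text does not state the exact formula used, so this sketch assumes the common definition of one minus the sum of squared group shares (two citizens drawn with replacement); the function name and the input format are hypothetical.

```python
from collections import Counter


def ethnic_fractionalization(group_labels):
    """Probability that two randomly drawn individuals belong to
    different groups, assuming draws with replacement:
    1 - sum of squared group shares."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    return 1.0 - sum((count / total) ** 2 for count in counts.values())
```

A panel in which everyone belongs to the same group has an index of zero, and the index rises as the population is spread more evenly over more groups.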

We tested the validity of the instruments by including them in the different models. They do not alter the other coefficient estimates in any significant way. Finally, we note that the instrumentation also affects the calculation of the model's variance: imputed endogenous variables have to be instrumented first and then instrumented values are used in the calculation of the variance-covariance matrix $V(\varphi)$.
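The instrumentation step can be illustrated with a generic two-stage least squares sketch. It is a textbook version under stated assumptions, not the estimation code behind the tables: the function name and argument layout are hypothetical, and it returns point estimates only, without the variance adjustment of (2.14).

```python
import numpy as np


def two_stage_least_squares(y, X_exog, X_endog, instruments):
    """Generic 2SLS point estimates.

    y           : (n,) dependent variable, e.g. income growth
    X_exog      : (n, k1) exogenous regressors, including a constant column
    X_endog     : (n, k2) endogenous regressors, e.g. initial income
    instruments : (n, m) excluded instruments, with m >= k2
    """
    # First stage: project the endogenous regressors on all exogenous variables.
    W = np.column_stack([X_exog, instruments])
    first_stage_coef = np.linalg.lstsq(W, X_endog, rcond=None)[0]
    X_endog_fitted = W @ first_stage_coef
    # Second stage: replace the endogenous regressors by their fitted values.
    X_hat = np.column_stack([X_exog, X_endog_fitted])
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]
```

The coefficients are returned in the order of the exogenous columns followed by the endogenous ones.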

### **2.6 Results**

The estimated standard errors in all our regressions are adjusted to account for prediction errors following the approach outlined in Section 2.5. The adjustments, illustrated for the baseline equation, are found in Tables 2.7 to 2.11.


**Table 2.7 Variance adjustments - Asuncion**

Source: Author's calculations based on results in Table 2.2


**Table 2.8 Variance adjustments - Central Urban**

• Variable lost significance in 2SLS estimation at 10% level.

Source: Author's calculations based on results in Table 2.3



**Table 2.9 Variance adjustments - Remaining Urban**

• Variable lost significance in 2SLS estimation at 10% level.

Source: Author's calculations based on results in Table 2.4



**Table 2.10 Variance adjustments - Rural**

• Variable lost significance in 2SLS estimation at 10% level.

Source: Author's calculations based on results in Table 2.5


**Table 2.11 Variance adjustments - Pro-Poor-Growth Panels**

• Variable lost significance in 2SLS estimation at 10% level.

Source: Author's calculations based on results in Table 2.6

In all four regions and in the poor household sample, the result is an increase in estimated standard errors for all coefficients. The last column of Tables 2.7 to 2.11 gives the ratio of the adjusted standard error estimates to the standard 2SLS estimates. The ratio varies across coefficients from a factor of 0.5 up to 22.3, considering all variables that did not lose significance in the 2SLS estimation. The results in Tables 2.7 to 2.11 illustrate the general decrease in significance when taking into account the fact that estimates or predictions, and not data, are used. In many cases the adjustment even 'destroys' a significant result, that is, causes the significance level to increase to over ten percent. This is the typical trade-off when analysing small area welfare estimates: the gain in the number of 'observations' obtained by using imputed variables is partly offset by the loss in precision due to (downstream) model prediction errors.

The main findings are presented in Tables 2.12 to 2.16, each in a series of between seven and ten regressions. They were separated into different regression models because the estimation of income is based on different models as well.

In the 2SLS regression the complete Asuncion model (regression 1) loses quality. Adjusted R squared decreases from 0.748 (Table 2.12) to 0.601 (Table 2.7), and the model's standard error increases from 0.008 (Table 2.12) to 0.011 (Table 2.7). All variables except one have the expected signs. Even though the total number of individuals per household decreased during the observation period, for Asuncion we get a negative sign for this change, significant at the 1% level in all specifications.

Conditional convergence is pronounced in all specifications: the coefficient on initial income is negative, highly significant and has a value of around -0.05 in all specifications. Apparently, sub-district panels with lower mean per capita income in 1992 have grown faster over the 1990s, *ceteris paribus.* However, note that the coefficient estimate is biased, so we should not attach significance to its exact value.

We have interesting and consistent results for employment: growing primary sector employment of the household head ends up harming growth, while growing tertiary sector employment of the household head benefits from growth. In three out of four specifications we find that decreasing household head education harms growth and, surprisingly, that female-headed households are better off regarding their growth capacity in all specifications. Household heads' education and age, spouses' education and changes in the number of children have very small effects.

The main variable of interest, inequality, has been entered using income inequality (gini). For Asuncion, education inequality is correlated with income inequality and was left out. The results show that income inequality (gini) has a significant negative effect on growth in all specifications. The change in income inequality (income inequality decreased in Asuncion) has a significant but negative effect only in model 6. The positive effects of a decrease in education inequality are up to three times stronger than the positive effect of an initial education inequality (considering standardized coefficients).

In the 2SLS regression the complete Central Urban model (regression 1) loses quality. Adjusted R squared decreases from 0.889 (Table 2.13) to 0.699 (Table 2.8), and the model's standard error increases from 0.007 (Table 2.13) to 0.011 (Table 2.8).

Conditional convergence is pronounced in all specifications: the coefficient on initial income is negative, highly significant and has a value of approximately -0.07 in all specifications.

All variables except one have the expected signs. Even though the total number of individuals per household decreased during the observation period for the Central Urban area, we get a negative sign for this change, significant at the 1% level in all specifications.

The main variable of interest, inequality, has been entered using income inequality and education inequality; these variables have been entered in linear and quadratic form in alternative specifications. Income inequality has a negative effect on growth, significant in three specifications at the 5% level and once at the 1% level. In contrast, education inequality has a changing effect on growth. In three out of four significant specifications, the effect is positive. When only education inequality is entered, without income inequality (column 2), there is no significant effect. Including education and income inequality squared produces mixed results (positive and negative coefficients), so there is no strong evidence for a U-shaped or inverted U-shaped relation, but the small decrease of income inequality observed in Central Urban has a negative effect on growth. At the same time, the observed decrease in education inequality has a strong and significant positive effect on growth. The observed effects of changes in household heads' employment sector, composition of household or family group or household age and initial education are very small.


**Table 2.12 Regression results - Asuncion** 

Notes: Absolute value of t statistics in parentheses.

\* Significant at 10%; \*\* significant at 5%; \*\*\* significant at 1%.

Source: Author's calculations based on results of income estimates in Chapter 1.


In the 2SLS regression the complete Remaining Urban model (regression 1) loses quality. Adjusted R squared decreases from 0.909 (Table 2.14) to 0.836 (Table 2.9), and the model's standard error increases from 0.012 (Table 2.14) to 0.016 (Table 2.9).

All variables except one have the expected signs. Even though the total number of individuals per household decreased during the observation period for the Remaining Urban area, we get a negative sign for this change, significant at the 1% level in all specifications.

Conditional convergence is pronounced in all specifications: the coefficient on initial income is negative, highly significant and has a value of approximately -0.07 in all specifications. Income inequality has a significant effect only in three out of eight specifications; two of these three are negative. Education inequality has a significant negative effect in all specifications. The observed increase in income inequality has a negative effect in all specifications and the observed smaller increase in education inequality has a negative and significant effect in all specifications.

Positive effects of household heads' education and age are still small but a little more important than in the Asuncion and Central Urban areas. Again, we have some evidence that female-headed households are better off regarding growth. For five out of 16 possible departments we find dummies with the expected signs regarding their overall economic performance. So sub-regional differences in growth performance exist, but their effect is quite small.

In the 2SLS regression the complete Rural model (regression 1) loses quality. Adjusted R squared decreases from 0.867 (Table 2.15) to 0.467 (Table 2.10), and the model's standard error increases from 0.015 (Table 2.15) to 0.034 (Table 2.10). All variables except one have the expected signs. Even though the total number of individuals per household decreased during the observation period for the rural area, we get a negative sign for this change, significant at the 1% level in all specifications. Conditional convergence is pronounced in all specifications: the coefficient on initial income is negative, highly significant and has a value of approximately -0.08 in all specifications.

Income inequality has a changing significant effect (two times positive, two times negative). Education inequality has a significant positive effect in all specifications. The observed increase in income inequality has an important negative effect on growth in all specifications, as does the increase in education inequality. Household heads' age and education do not have important effects on growth. Female-headed households are better off regarding their growth capacities, as are households whose head is working in the commercial sector. Nevertheless, the positive effect of an increase in commercial employment, even if highly significant, ends up being very small.

For two out of 17 possible departments we find dummies with the expected signs regarding their overall economic performance. Consequently, sub-regional differences in growth performance exist, but their effect is quite small.

Before running a separate fifth regression model on a sub-sample of panels for which pro-poor-growth has been determined, we checked the reliability of these data (see Annex). About 97% of the sub-sample for pro-poor-growth is from rural areas. There are no spatial patterns: the Pro-Poor-Growth (PPG) panels are distributed all over the country, so PPG does not seem to be the result of a specific geographic area or of particular districts with better economic performance. It is a consequence of activities carried out by certain groups of people, permitting them to overcome part of their poverty. This phenomenon is observed in almost every part of the country (in 15 out of 18 departments and in 154 of the 224 districts).

If PPG is a consequence of group dynamics and not of spatial structures, we should know more about these group characteristics. In all PPG panels the mother tongue is Guarani (an indicator of low ethnic fragmentation), and 98.4% of the household heads have less than 5 years of education. The maximum geographic concentration is 29 panel groups in the same district (2.4% of the sample). The 1300 identified PPG panel groups represent approximately 5% of all households and some 10% of poor households. The age distribution of PPG panel household heads follows the age distribution of all household heads.

In the 2SLS regression the complete PPG model (regression 1) loses almost all its quality. Adjusted R squared decreases from 0.601 (Table 2.16) to 0.087 (Table 2.11), and the model's standard error increases from 0.017 (Table 2.16) to 0.049 (Table 2.11).

All variables except one have the expected signs. Even though the total number of individuals per household decreased during the observation period, for the PPG subsample we get a negative sign for this change, significant at the 1% level in all specifications.

Conditional convergence is pronounced in all specifications: the coefficient on initial income is negative, highly significant and has a value of approximately -0.06 in all specifications.

Income inequality has an important negative effect. By construction, this is to be expected at least if a household is poor. Education inequality has a positive effect in four out of nine specifications. The small decrease observed in income inequality has a negative and highly significant effect in all specifications. No significant effect is caused by the increase in education inequality. Income inequality squared produces significant positive effects in five out of six specifications, so there seems to be a U-shaped relation. Only in one specification, leaving out initial education inequality, does education inequality squared produce a significant positive effect.

Household heads' age and education, the number of children and the change in their number (small decrease observed) do have significant but very small effects on PPG in our case.


#### **Table 2.13 Regression results - Central Urban**


Notes: Absolute value of t statistics in parentheses.

\* Significant at 10%; \*\* significant at 5%; \*\*\* significant at 1%.

Source: Author's calculations based on results of income estimates in Chapter 1.


**Table 2.14 Regression results - Remaining Urban**




#### **Table 2.15 Regression results -Rural**



**Table 2.16 Regression results - Pro-Poor-Growth Panels** 

Notes: Absolute value of t statistics in parentheses.

\* Significant at 10%; \*\* significant at 5%; \*\*\* significant at 1%.

Source: Author's calculations based on results in Chapter 1.

### **2.7 Discussion**

The two most important findings of this study are that (1) income inequality does not necessarily have a negative effect on growth, but the observed decrease in income inequality harms growth in all models estimated; and (2) education (human capital) inequality has mixed effects on growth, depending on the initial level of education inequality. An increase in education inequality harms growth and a decrease in education inequality benefits growth. Furthermore, (3) in the Paraguayan case, the effects of changes in inequality are larger than the effects of inequality itself and (4) inequality effects and the effects of their change are larger than family-group or employment sector effects. (5) There is almost no PPG in urban areas, and in rural areas it is related to groups of individuals but not to geographical location. (6) Lower population growth (a decrease in the total number of individuals per household) is negative for growth and (7) female-headed households are better off regarding income growth. The first of these findings is mainly in line with cross-country evidence in Birdsall and Londono (1997), while the second result contrasts with findings in that paper, but supports findings by Schipper and Hoogeveen (2004).

This second point may appear somewhat counterintuitive at first sight: growth is enhanced when human capital (or access to it) of the household head is more unequally distributed. The key to understanding what is going on is the fact that we control for district mean level of education: this means that our conclusion is that *at a given mean level* of human capital, a more unequal distribution of this capital is good for growth. Nevertheless, there is some weak evidence in Paraguayan data that this is not true at any level, because for higher levels of education inequality in Paraguay its effect on growth is negative (Remaining Urban region) or tends to be negative (Central Urban region) but has positive effects on growth in the Rural area (and through this on PPG). Elbers and Gunning (2004) show that our result is to be expected in a Ramsey growth model: under the condition that the production function is convex in human capital, a mean-preserving spread in human capital results in higher output growth. For instance, suppose we were to redistribute one year of education from someone with low educational attainment to someone who is reasonably well educated. This would make the distribution of human capital more unequal while keeping the mean constant. However, if the increase in output by the well-educated person exceeds the decline for the less well-educated person, then the increased spread in education has a positive effect on growth, as long as the mean level of education is kept constant.
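The redistribution example above can be made concrete with a small numerical sketch. The production function and the numbers are purely illustrative assumptions (output equal to years of education raised to a hypothetical exponent `alpha`), not estimates from the Paraguayan data.

```python
def output(years_of_education, alpha=2.0):
    """Hypothetical production function: convex for alpha > 1,
    concave for 0 < alpha < 1."""
    return years_of_education ** alpha


def mean(values):
    return sum(values) / len(values)


def total_output(education_levels, alpha=2.0):
    return sum(output(h, alpha) for h in education_levels)


# Move one year of education from the less educated to the better educated
# person: the mean stays at 6 years while the spread widens.
before = [4.0, 8.0]
after = [3.0, 9.0]
```

With the convex function (`alpha=2.0`) total output is higher after the spread, while with a concave function (for example `alpha=0.5`) the same redistribution lowers it, which is why the convexity condition matters for the result.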

Mean preserving spreads in human capital are not possible within a given population; they only exist in theoretical experiments or in the long run, that is, over generations. In reality, the mean level of education and inequality change simultaneously. In rural Paraguay, where a positive effect of education inequality on growth was found, education inequality, as measured by the Gini coefficient, has a negative correlation with the average level of education (see Figure 2.1). In theory, the implication of such a correlation is that, while raising the general level of education through policies like universal primary education will be good for growth, its positive effects will be partly offset by an expected associated decline in education inequality. Nevertheless, for rural Paraguay the empirical evidence is that even though mean household heads' education increased, education inequality also increased. This increase in education inequality harmed growth, even if the initial level of education inequality seems to have been an advantage. This evidence, combined with results from different urban areas in Paraguay (where an education inequality higher than in the rural area was harmful for growth), confirms the hypothesis that for a given level of inequality in relation to a given number of years of schooling, a higher level of education inequality can be beneficial; however, this does not mean that any higher level of education inequality has the same effect.

**Figure 2.1 District means of education and education inequality of household head in rural Paraguay (1992)** 

Source: Author's calculations based on results in Chapter 1.

The larger effects of changes in inequality compared with the effects of inequality itself on growth are consistent with Paraguayan macro-economic and business cycle history, as well as with its education policy during this business cycle. Slowing growth and the onset of recession reduced income growth. For all three different urban areas, annual growth rates of per capita income are negative, while the rural rate is positive but small. At the same time, an increase in education was driven by an education reform that started in Paraguay in 1994, producing a decrease in education inequality only in the Asuncion and Central Urban areas. Within a context of economic recession, these effects turn out to be stronger than most of the observed changes inside families or regarding employment opportunities.

The PPG related to groups of individuals and not to geographic location indicates that the sort of PPG we observe in Paraguay is related more to opportunities and less to structural changes or other effects. As a matter of fact, there were few structural changes in the rural economy in Paraguay over the period observed. Most of the rural PPG opportunities could be related to new cash crop farming and their export, such as sesame and some varieties of organic cotton. Unfortunately our census database cannot link the empirical evidence with the production sector data, since we are working with a pseudo panel and not pure household micro-data.

Finally, a decrease in the total number of individuals per household (consistent with the fertility decrease during this period) is negatively related to growth, even if a decrease in the number of children per household is not. In a way, we could consider that fewer people per household in general means a smaller workforce and a lower capacity for generating income. On the other hand, if annual income growth rates are negative, fewer people in a household should push up per capita income. This is possibly a spurious relation, because per capita income and number of people per household decreased simultaneously.

In rethinking all these results from an income-mobility point of view, remember that the initial level of income can be understood as a proxy for family background and initial education level as a proxy for institutional opportunities to develop talent. Also remember that initial income inequality was considerably high and slightly decreasing during the observation period, while education inequality was lower and moved in different directions for the different regions. If initially higher levels of income facilitate upward income mobility and a higher income inequality benefits that process, we should expect that this effect benefits a more middle class kind of household. If inequality supports growth, it is easier to grow, but at the same time more difficult to reduce poverty, which in the end would require strong income growth for low-income groups. Remember that in our results, initial levels of education have almost no effect on growth and education (human capital) inequality has mixed effects on growth, depending on the initial level of education inequality, with an increase in education inequality that harms growth and a decrease of education inequality that benefits growth. If so, the best combination for upward income mobility would be a high level of initial income in an area with high income inequality and an institutional capacity to widen education opportunities in a way that education inequality decreases. This combination can be found in the PPG sub panel (97% rural area) and for the Central Urban area. Nevertheless, in both areas, poverty increased during the observation period. Consequently, in the Central Urban area, it might have been a middle class phenomenon, and for PPG panels and the rural area, even if there were positive growth and income mobility effects, they may not have been strong enough to get households out of poverty and there may have been some downward mobility as well.

### **2.8 Conclusions**

We estimated the effect of income and education inequality on growth, using imputed data on income inequality and growth for small administrative units in Paraguay (districts), along with census data for education inequality; all this based on a pseudo panel data set. Carrying out this kind of analysis for a specific country has important benefits. First, it avoids data comparability problems that typically affect cross-country growth regressions. Moreover, by identifying the effects of inequality on growth for a given country, country specificity is taken into account. This enhances the relevance of our results for local policy makers.

In the empirical section we adjusted the standard errors of variable coefficients for the fact that some regressors are imputed; in our case initial income levels and income inequality, and therefore associated with a standard error. The adjustments are considerable; they typically increase standard errors from a factor 0.5 up to 22, using five different models for different areas or groups of households. Our models are not alone in using imputed variables. Most growth regressions do so by relying on GDP or survey based inequality estimates, for instance. This puts into question the significance of some of the inequality and growth results reported elsewhere.

Our results show for rural Paraguay that higher levels of education inequality enhance growth. Controlling for the level of educational attainment, larger variation in education is here good for growth. The latter finding is plausible if the production function is convex in ability, something that can be illustrated with a Ramsey type household growth model. Nevertheless, we find opposed results for urban areas, where education inequality is higher. Our results also show that higher income inequality does not have a uniform effect on growth (it tends to be more harmful in larger urban areas) and that effects of changes in inequality on growth are larger than the effect of inequality itself, this is for both, education and income.

What does this mean for policy in Paraguay? If policymakers are mostly interested in growth, they should be more concerned on income inequality in urban areas and on education inequalities in rural Paraguay. Income inequality is an important issue for income growth in urban Thomas Otter - 978-3-631-75367-5 areas (and more important in the Asuncion and Central Urban areas), in a consistent way with the rapidly increasing urban poverty. Fighting urban poverty must consider income inequality. At the same time, the impact of income inequality in rural areas is much less of a problem. Also, education inequality is a greater problem in urban areas, but politics seem to be on track with a certain success of targeting urban education services, since urban education inequality tends to go down, which benefits income growth. For rural areas, the problem is more sophisticated. Even if initial education inequality benefited rural income growth, a badly targeted or non-universal policy implementation of education reform in rural area, increased education inequality, which in theory harms growth. If, for intrinsic reasons or otherwise, policy makers are interested in reducing education inequality, our results suggest that this would damage growth, but only if the policy was pursued by keeping the mean level of education constant. In practice a policy aimed at reducing inequality in education will almost always be mean increasing.

Finally, even though the poverty map exercise which preceded this paper suggested that there are important spatial effects on poverty levels, we did not find spatial effects in the PPG evidence, which seems to be driven more by individual and group dynamics and by access to (labour and employment) opportunities. For policy, this should mean that there is no need for special growth strategies for particular areas of the country, as long as new opportunities arise for almost all of the workforce and not only for a few (which would increase income inequality).

# **Chapter 3**

# **Characterization of inequality changes through microeconometric decomposition**

### **Paraguay 1992 - 2005**

### **3.1 Introduction**

The main economic variables oscillated widely in Paraguay during the 1992 - 2005 period, in association with macroeconomic and structural transformations, but also following general growth trends and business cycles in the South American region. The period can be separated into three sub-periods: 1992 to 1998, 1999 to 2002 and 2003 to 2005.

During the early eighties, the Paraguayan economy benefited from high public investment rates resulting from the construction of the Itaipu and Yacyreta hydroelectric power plants. The country followed its own path of stability and growth during a period of hyperinflation and external debt crises in many South American countries. Nevertheless, its economy fell into a growth crisis (still avoiding debt crisis and hyperinflation) during the second half of the eighties, once the construction period of the hydroelectric power plants came to an end. During the first half of the nineties, the Paraguayan economy recovered from recession, now driven by agricultural production and a re-export business boom, based on special arrangements for duty rates for electric and electronic equipment imported to the MERCOSUR (Mercado Comun del Sur, the regional free trade agreement established in 1991 by Argentina, Brazil, Paraguay and Uruguay) region via Paraguay. The agricultural success was based on a recovery of international cotton prices, combined with a successful cotton extension program for small farmers within the country and a quick and widespread expansion of mechanized soybean farming. The commercial success with electric and electronic components was based on re-export. Paraguayan import duty rates on goods from outside MERCOSUR were so low that Brazilian and Argentine enterprises preferred to buy these products re-exported from Paraguay rather than import them directly from outside MERCOSUR, which would have meant higher duty rates. Against this background, Paraguayan GDP per capita grew until 1995 and then remained relatively stable until 1998. The per capita income<sup>26</sup> Gini coefficient fell from 55.8 to 54.0. Mean per capita growth was 0.63% and poverty dropped from 38.2% to 32.1%.

<sup>26</sup> Including all kinds of income: labour income, non-labour income and imputed values for own housing.

The period between 1999 and 2002 saw great political instability. Weak and unfinished structural reform processes in the economy, which had begun in the nineties, were terminated. Small-scale cotton farming entered a deep crisis due to falling international prices and considerable parasite problems, combined with adverse climatic conditions (the El Nino phenomenon) that affected agriculture in general. External shocks such as the Brazilian devaluation during the global financial crisis and the Argentine default hit the country hard. Per capita growth was -2.6% and poverty leaped to more than 46%. The Gini coefficient for income inequality rose to 56.1. From 2003 onwards, political changes brought the country back to a more stable course. A tax reform and institutional improvements in government provided more and new revenues to the treasury. Public expenditure, including social expenditure, went up. The economy benefited from a regional recovery. In the production sector, this period is marked by strong growth in livestock and meat exports. Per capita GDP grew 2.0% on average, while poverty went down to 38.2% and the Gini coefficient to 52.8.

However, the reasons behind these changes in inequality are more varied and complex than a purely macroeconomic history can tell. The main purpose of this paper is to assess the relevance of some forces that are believed to have affected income inequality in Paraguay between 1992 and 2005. More specifically, the microeconometric decomposition methodology proposed by Bourguignon, Ferreira and Lustig (1998) has been used to measure the relevance of various factors which appear to have driven changes in inequality. In particular, this methodology has been used to identify to what extent changes in the returns to education and experience, in the endowments of unobservable factors (such as an individual's innate ability) and their returns, in the wage gap between men and women, in labour market participation and hours of work, and in the educational structure of the population help to explain the observed changes in income distribution.

The results of this paper suggest that the smaller change in inequality between 1992 and 1997/98 is mainly the result of employment (including hours of work) and education effects, characterized by an expansion of primary schooling. The larger reduction in inequality after 1997 is due to returns to education, hours of work (since unemployment increased) and unobservable factors.

The rest of the paper is organized as follows. Section 2 presents the decomposition methodology implemented to assess the relevance of those factors. Section 3 shows the basic facts and discusses some factors that may have affected inequality during the last two decades, while section 4 explains the estimation strategy. The main results of the analysis are presented in section 5. The paper concludes with some brief final comments in section 6.

### **3.2 Methodology**

Many different forces lie behind the long-run changes in income distributions or, more generally, distributions of economic welfare, within a population. Some of these forces have to do with changes in the distribution of factor endowments and socio-demographic characteristics, others with the returns these endowments produce, and others with changes in the population's behaviour, such as labour supply, consumption patterns or the decision on whether or not to have children. These forces are not independent of each other. This is what makes it difficult to identify precisely the fundamental causes and mechanisms behind the dynamics of income distribution. Decomposition techniques are used to identify causes of distributional changes. Traditional techniques explain differences in scalar summary measures of distributions rather than in full distributions. The best known of these techniques are the Oaxaca-Blinder decomposition of differences in mean incomes across population groups with different characteristics (Blinder 1973; Oaxaca 1973) and the variance-like decomposition property of the so-called decomposable summary inequality measures (Bourguignon 1979; Cowell 1980; Shorrocks 1980). In both cases, the underlying logic is that the aggregate mean income (or inequality measure) in a population is the result of aggregating various socio-demographic groups or income sources. Thus, changes in the overall mean or inequality measure can be explained by identifying changes in the means and inequality measures within those groups or income sources, and in their weights in the population or in total income.

The new focus on poverty and inequality reduction, which increasingly drives development policy, requires new techniques for analysing the shape of the distribution, for example in the vicinity of and below the poverty line. In terms of the Oaxaca-Blinder approach, the issue is not so much whether mean earnings are lower for women than for men, since the former may have less education on average, as whether the differences are greater or smaller in the bottom part of the earnings distribution. Answering this kind of question requires handling the whole distribution rather than summary measures. To assess the relevance of the various factors for income inequality changes while handling whole distributions, a microeconometric decomposition methodology first proposed by Bourguignon, Ferreira and Lustig (1998) was tailored to the Paraguayan case.<sup>27</sup>

<sup>27</sup> Variants of the basic methodology have been applied in Altimir, Beccaria and Gonzalez Rozada (2000), Bourguignon, Gurgand and Fournier (1999), Bouillon, Gasparini, Marchionni and Sosa (2000), Legovini and Lustig (1998) and Ferreira and Paes de Barros (1999), amongst others.

### *The basic model*

The decomposition of a distributional change essentially consists of contrasting representations of the income-generation process (evaluating differences in estimated parameters) for two different distributions (two points in time) on the one hand, and accounting for changes in the joint distribution of endowments on the other. Bourguignon, Ferreira and Lustig (1998) use a parametric representation of inequality changes, because the parameters lend themselves directly to relevant economic interpretations.

More formally, a parametric representation of an income generation process can be defined by a set of variables $X = (V, W)$, where specific combinations of the individual characteristics $V$ and the values of the individual characteristics $W$ define groups. A general parametric representation of the conditional functions $g^{\tau}(y \mid V, W)$ and $h^{\tau}(V \mid W)$ relates $y$ and $(V, W)$ on the one hand, and $V$ and $W$ on the other, according to some predetermined functional form. These relationships can be denoted as follows:

$$y = G[V, W, \varepsilon; \Omega_{\tau}] \tag{3.1}$$

$$V = H[W, \eta; \Phi_{\tau}] \tag{3.2}$$

where $\Omega_{\tau}$ and $\Phi_{\tau}$ are sets of parameters and $\varepsilon$ and $\eta$ are random variables ($\eta$ is a vector if $V$ is a vector). These random variables play a similar role to the residual term in standard regressions. They are meant to represent the dispersion of income $y$ or of individual characteristics $V$ for given values of the individual characteristics $(V, W)$ and $W$, respectively. They are also assumed to be distributed independently of these characteristics, according to density functions $\pi^{\tau}$ and $\mu^{\tau}$. The functions $G$ and $H$ have pre-imposed functional forms.

If this model were to be applied to the distribution of individual earnings, the methodology would be rather simple. Ignoring the partition of *X* into exogenous characteristics *(W)* and non-exogenous individual characteristics (V), a simple parametric representation of individual earnings as a function of individual characteristics is given by:

$$\log y = X \cdot \Omega + \varepsilon \tag{3.3}$$

In this particular case, the function $G(\cdot)$ is thus as follows:

$$G(X, \varepsilon; \Omega) = e^{X \cdot \Omega + \varepsilon} \tag{3.4}$$

To obtain estimates for the set of parameters $\Omega$ and for the distribution of the random term $\varepsilon$, one may rely on standard econometric techniques. Running a regression on samples of the observations $i$ available at time $\tau$,

$$\log y_i^{\tau} = X_i^{\tau} \cdot \Omega_{\tau} + \varepsilon_i^{\tau} \tag{3.5}$$

yields an estimate of the set of parameters $\Omega_{\tau}$, as well as of the distribution $\pi^{\tau}$ of the random term. Then, counterfactuals $D$ can be computed easily. Without the $(V, W)$ distinction, a counterfactual is defined as $D(\chi, \pi; \Omega)$, where $\chi(W, \eta)$ is the joint distribution of the exogenous components of $(V, W)$. In the discrete representation $\{y_i\}^{\tau} = (y_1, y_2, \ldots, y_{N_{\tau}})$ of the distribution at time $\tau$, where $N_{\tau}$ is the number of observations in the sample available at time $\tau = t, t'$, it is identically the case that

$$D(\chi_t, \pi_t, \Omega_t) = \{y_i\}^{t}. \tag{3.6}$$

The counterfactual $D(\chi_t, \pi_t, \Omega_{t'}) = \{y_i\}_{\Omega}^{t \to t'}$ is obtained by computing:

$$\log (y_i)_{\Omega}^{t \to t'} = X_i^{t} \cdot \hat{\Omega}_{t'} + \hat{\varepsilon}_i^{t} \qquad \text{for } i = 1, 2, \dots, N_t \tag{3.7}$$

where the notation $\hat{\ }$ stands for OLS estimates. The counterfactual is thus obtained by simulating the preceding model on the sample of observations available at time $t$. This simulation shows what the earnings of each individual in the sample would have been if the returns to each of the observed characteristics had been those observed at time $t'$ rather than the actual returns at time $t$. The returns to the unobservable characteristics that may be behind the residual term $\varepsilon$ are nonetheless assumed to be unchanged. This is equivalent to the evaluation of the price effect for observed characteristics in the Oaxaca-Blinder calculation. The difference is that the evaluation is carried out for every individual in the sample. The counterfactual for the distribution of the random term, $D(\chi_t, \pi_{t'}, \Omega_t) = \{y_i\}_{\pi}^{t \to t'}$, is a little more difficult to construct. Importing the distribution of residuals from time $t'$ to time $t$ requires an operation known as a *rank-preserving transformation*, whereby the residual in the $n$th percentile (of residuals) at time $t$ is replaced by the residual in the $n$th percentile at time $t'$, for all $n$. As the operation is not immediate when the number of observations is not the same in the two samples, an approximate solution is used. It consists of assuming that both distributions of residual terms are the same up to a proportional transformation. An example would be if residuals were normally distributed with mean zero. The rank-preserving transformation is then equivalent to multiplying the residual observed at time $t$ by the ratio of the standard deviations at time $t'$ and $t$. $D(\chi_t, \pi_{t'}, \Omega_t) = \{y_i\}_{\pi}^{t \to t'}$ is thus defined by:

$$\log (y_i)_{\pi}^{t \to t'} = X_i^{t} \cdot \hat{\Omega}_{t} + \hat{\varepsilon}_i^{t} \cdot \left( \hat{\sigma}^{t'} / \hat{\sigma}^{t} \right) \qquad \text{for } i = 1, 2, \dots, N_t \tag{3.8}$$
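The two counterfactuals in equations (3.7) and (3.8) can be sketched in a few lines of code. What follows is a minimal illustration and not the estimation used in this study: it assumes a single regressor (years of schooling) plus a constant, fitted by OLS in plain Python, and it runs on made-up toy data. The function names (`ols_simple`, `counterfactual_prices`, `counterfactual_residuals`) and all numbers are hypothetical.

```python
import math
import statistics


def ols_simple(x, y):
    """OLS of y on a constant and one regressor x.

    Returns ((intercept, slope), residuals)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return (intercept, slope), residuals


def counterfactual_prices(x_t, resid_t, coef_other):
    """Eq. (3.7): keep time-t characteristics and residuals, import coefficients."""
    a, b = coef_other
    return [math.exp(a + b * xi + e) for xi, e in zip(x_t, resid_t)]


def counterfactual_residuals(x_t, resid_t, coef_t, sd_t, sd_other):
    """Eq. (3.8): keep time-t characteristics and coefficients, rescale residuals."""
    a, b = coef_t
    scale = sd_other / sd_t  # ratio of residual standard deviations, t' over t
    return [math.exp(a + b * xi + e * scale) for xi, e in zip(x_t, resid_t)]


# Hypothetical toy samples: years of schooling and log hourly earnings.
school_t = [0, 6, 6, 12, 12, 16]
logy_t = [7.0, 7.5, 7.9, 8.4, 8.2, 9.1]
school_tp = [3, 6, 9, 12, 12, 16, 18]  # the t' sample may differ in size
logy_tp = [7.2, 7.4, 7.9, 8.1, 8.6, 9.3, 9.2]

coef_t, resid_t = ols_simple(school_t, logy_t)
coef_tp, resid_tp = ols_simple(school_tp, logy_tp)
sd_t = statistics.pstdev(resid_t)
sd_tp = statistics.pstdev(resid_tp)

# Simulated earnings of the time-t sample under each counterfactual.
y_prices = counterfactual_prices(school_t, resid_t, coef_tp)
y_resid = counterfactual_residuals(school_t, resid_t, coef_t, sd_t, sd_tp)

# Sanity check in the spirit of eq. (3.6): with time-t coefficients and
# time-t residuals the model reproduces the observed earnings.
y_actual = counterfactual_prices(school_t, resid_t, coef_t)
```

Comparing an inequality index computed on `y_prices` or `y_resid` with the same index computed on `y_actual` then gives a rough reading of the contribution of each factor.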

With those counterfactuals at hand, estimates for the contribution to the observed overall distributional change between $t$ and $t'$ of the change in the $\Omega$ parameters, in the distribution of residuals ($\pi$), and possibly even of these two changes taken together, may easily be found. The effect of changing the distribution of individual endowments, $X$, is obtained as the complement of the two previous changes:

$$\{y_i\}^{t'} - D(\chi_t, \pi_{t'}, \Omega_{t'}) \tag{3.9}$$
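To see why the endowment effect is a complement, it may help to write the total change as a telescoping sum. The identity below is only a sketch in the notation used above. It fixes one particular order of substitution (first $\Omega$, then $\pi$), which is an assumption made here for illustration; a different order would generally give different individual terms. Each difference should be read as a difference between the two distributions, or between a summary measure computed on each of them.

$$\{y_i\}^{t'} - \{y_i\}^{t} = \underbrace{\left[D(\chi_t, \pi_t, \Omega_{t'}) - D(\chi_t, \pi_t, \Omega_t)\right]}_{\text{price effect}} + \underbrace{\left[D(\chi_t, \pi_{t'}, \Omega_{t'}) - D(\chi_t, \pi_t, \Omega_{t'})\right]}_{\text{residual effect}} + \underbrace{\left[\{y_i\}^{t'} - D(\chi_t, \pi_{t'}, \Omega_{t'})\right]}_{\text{endowment effect}}$$

The intermediate counterfactuals cancel in pairs, and $D(\chi_t, \pi_t, \Omega_t) = \{y_i\}^{t}$ by (3.6), so the three terms add up to the total change.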

### *Adaptation to Paraguayan data*

Let $Y_{it}$ be the labour income of individual $i$ at time $t$, which can be written as a function $F$ of the vector $X_{it}$ of individual observable characteristics affecting wages and employment, the vector $\epsilon_{it}$ of unobservable characteristics, the vector $\beta_t$ of parameters that determine market hourly wages and the vector $\lambda_t$ of parameters that affect employment outcomes (participation and hours of work).

$$Y_{it} = F(X_{it}, \epsilon_{it}, \beta_t, \lambda_t) \qquad i = 1, \ldots, N \tag{3.10}$$

The distribution of individual labour income can be represented as:

$$D_t = \{Y_{1t}, \dots, Y_{Nt}\} \tag{3.11}$$

We can simulate individual labour incomes by changing one or more arguments in equation (3.10). For instance, the following expression represents the labour income that individual $i$ would have earned at time $t$ if the parameters determining wages had been those of time $t'$, keeping all other things constant.

$$Y_{it}(\beta_{t'}) = F(X_{it}, \epsilon_{it}, \beta_{t'}, \lambda_t) \qquad i = 1, \ldots, N \tag{3.12}$$

More generally, we can define $Y_{it}(k_{t'})$, where $k$ is any set of arguments in (3.10). Hence, the simulated distribution will be:

$$D_t(k_{t'}) = \{Y_{1t}(k_{t'}), \ldots, Y_{Nt}(k_{t'})\} \tag{3.13}$$

The contribution to the overall change in the distribution of a change in $k$ between $t$ and $t'$, holding all else constant, can be obtained by comparing (3.11) and (3.13). Although we can make the comparisons in terms of the whole distributions, in this paper we only compare inequality indices $I(D)$. Therefore, the effect of a change in argument $k$ is defined by:<sup>28</sup>

$$I[D_t(k_{t'})] - I(D_t) \tag{3.14}$$

<sup>28</sup> In the empirical implementation, only the labour income distribution is computed.
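As an illustration of equation (3.14), the sketch below computes the effect of a change in an argument $k$ when the inequality index $I$ is the Gini coefficient. It uses the standard rank-based formula for the Gini on a 0 to 1 scale (the figures quoted in this chapter are on a 0 to 100 scale). The function names `gini` and `effect_of_change` and the two small income vectors are hypothetical.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes, on a 0 to 1 scale."""
    ys = sorted(incomes)
    n = len(ys)
    total = sum(ys)
    if n == 0 or total <= 0:
        raise ValueError("need at least one income and a positive total")
    # Rank-based formula: G = 2 * sum(rank * y) / (n * sum(y)) - (n + 1) / n
    weighted = sum(rank * y for rank, y in enumerate(ys, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def effect_of_change(actual, simulated, index=gini):
    """Eq. (3.14): I[D_t(k_t')] - I(D_t) for a chosen inequality index."""
    return index(simulated) - index(actual)


# Hypothetical incomes: observed at time t, and simulated with k set to its t' value.
actual_t = [1.0, 2.0, 3.0, 4.0]
simulated_t = [2.0, 2.0, 3.0, 3.0]

# A negative value means the change in k was equalizing.
effect = effect_of_change(actual_t, simulated_t)
```

Passing a different function as `index` would give the same decomposition for any other inequality measure.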


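This paper is devoted to discussing the following effects: returns to education, the gender wage gap, returns to experience, unobservable factors and their returns, hours of work, labour market participation, and the educational structure of the population.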


The previous discussion refers to the distribution of individual earnings. However, it is more relevant from a social point of view to study the distribution of household income since a person's utility usually depends not on their own earnings, but on their household income and demographic composition. Following Buhmann et al. (1988), equivalent household income is given by:

$$Y_{it}^{q} = \sum_{j \in h} \left(Y_{jt} + Y_{jt}^{0}\right) \Big/ \left(\sum_{j \in h} a_j\right)^{\theta} \qquad i = 1, \dots, N \tag{3.15}$$

where $Y^{q}$ stands for equivalent household income, $h$ is the household, $Y^{0}$ is the income from other sources, $a$ is the equivalent adult and $\theta$ captures household economies of scale. The distribution of equivalent household income can be expressed as:

$$D_t^{q} = \{Y_{1t}^{q}, \dots, Y_{Nt}^{q}\} \tag{3.16}$$

Changing argument *k* to its value in *t'* yields the following simulated equivalent household income in year *t:* 

$$Y_{it}^{q}(k_{t'}) = \sum_{j \in h} \left(Y_{jt}(k_{t'}) + Y_{jt}^{0}\right) \Big/ \left(\sum_{j \in h} a_j\right)^{\theta} \qquad i = 1, \ldots, N \tag{3.17}$$

Hence, the simulated distribution is:

$$D_t^{q}(k_{t'}) = \{Y_{1t}^{q}(k_{t'}), \ldots, Y_{Nt}^{q}(k_{t'})\} \tag{3.18}$$

The effect of a change in argument $k$, holding all else constant, on equivalent household inequality is given by:<sup>29</sup>

$$I[D_t^{q}(k_{t'})] - I(D_t^{q}) \tag{3.19}$$
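Equation (3.15) can be illustrated with a short sketch. The function name `equivalent_income`, the example household and its adult-equivalent weights are hypothetical. The default exponent of 0.8 is the value this chapter reports using for household economies of scale, and income from other sources defaults to zero, in line with the empirical implementation, which considers labour income only.

```python
def equivalent_income(labour_incomes, adult_equivalents, other_incomes=None, theta=0.8):
    """Equivalent household income as in eq. (3.15).

    Total household income divided by the number of equivalent adults
    raised to theta. Every member of the household receives this value."""
    if other_incomes is None:
        other_incomes = [0.0] * len(labour_incomes)
    if not (len(labour_incomes) == len(adult_equivalents) == len(other_incomes)):
        raise ValueError("one entry per household member is required")
    total_income = sum(labour_incomes) + sum(other_incomes)
    return total_income / (sum(adult_equivalents) ** theta)


# Hypothetical household: two earning adults and one child with weight 0.5.
weights = [1.0, 1.0, 0.5]
observed = equivalent_income([100.0, 50.0, 0.0], weights)

# Same household with the second earner's income simulated under k at t'.
simulated = equivalent_income([100.0, 80.0, 0.0], weights)
```

With `theta=1` the measure reduces to income per equivalent adult, and with `theta=0` to total household income, so values in between express partial economies of scale.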

### **3.3 Income inequality in Paraguay: basic facts and sources of change**

Per capita income inequality in Paraguay during the nineties shows a generally downward trend, rising during the economic and political crisis between 1999 and 2002 and then resuming its decline. Figure 3.1 shows the Gini coefficient of per capita household income between 1992 and 2005 in Paraguay, combined with poverty headcount measures and GDP per capita.<sup>30</sup> Yearly updates of poverty and inequality measures have been available in Paraguay only since 2001.

**Figure 3.1 Gini coefficient of Per Capita Household Income Distribution, Poverty and GDP per capita in Paraguay, 1992 - 2005** 

Source: Author's calculations based on EPH and EIH of National University and DGEEC for poverty and inequality. GDP data from the Banco Central del Paraguay.

<sup>29</sup> In the empirical implementation we ignore income from sources other than labour, $Y_{jt}^{0}$, and we consider all individuals such that $Y_{it}^{q} \geq 0$ and $Y_{it}^{q}(k_{t'}) \geq 0$.

<sup>30</sup> The 1992 survey was carried out by the Universidad Nacional de Asuncion, while the 1995 to 2005 surveys are Encuestas Permanentes de Hogares (EPH) or Encuestas Integrales de Hogares (EIH, only 1997/98 and 2000/01) carried out by the National Statistical Office (Direccion General de Estadistica, Encuestas y Censo - DGEEC).

For simplicity, this study focuses on three years of relative macroeconomic stability separated by almost equal intervals: 1992, 1997/98 and 2005.<sup>31</sup> The analysis was restricted to labour income mainly for two reasons: (i) the Permanent Household Surveys (EPH) and Integrated Household Surveys (EIH) have various deficiencies in capturing capital income, and (ii) modelling capital income and retirement payments is not an easy task, especially considering the scarce information included in the surveys. Households whose heads or spouses are older than 64, or receive retirement payments, were ignored. The following analysis concentrates on the distribution of individual labour income<sup>32</sup> and on the distribution of equivalent<sup>33</sup> household labour income.



Table 3.1 shows the basic facts to be characterized in the paper: inequality in individual labour income and in equivalent household labour income, as measured by the Gini, dropped almost ten percentage points between 1992 and 2005. Interestingly, the reduction in individual labour income inequality is stronger than the reduction in equivalent household labour income inequality. The reduction between 1997/98 and 2005 is much stronger than between 1992 and 1997/98, even though the later period contains a sub-period of economic, political and social crisis in which inequality grew. One possible reason for the stronger decrease of inequality in labour income, compared with inequality in household income, may lie in changes in non-labour income sources. The main non-labour income sources in Paraguayan income survey measures are the imputed value of own housing and transfers from family members inside and outside the country. It is mainly the poor who benefit from imputed values for own housing, no matter how precarious their housing might be. The market for renting houses or apartments is underdeveloped. It is almost exclusively an urban phenomenon, restricted to rich households that can afford to pay rent. Cash transfers from the private sector to Paraguayan households came mainly from Argentina during the nineties and, since 2003, increasingly from Spain and the US. Private sector cash transfers are distributed all over society, from the very poor to the very rich. Private cash transfers were 1.8% of total household income in 1992, 4.3% in 1997/98 and 4.7% in 2005. Other possible sources of differences in changes between labour and household income are changes in marriage markets or changes in household composition (total number of individuals per household). Regression results in the next chapter will give some hints on these points. There are almost no public cash transfers, apart from very small pension payments, but pension recipients were excluded from the analysis by definition.

Innumerable factors may have caused the changes in inequality documented in Table 3.1. We will concentrate on seven of these: (i) returns to education, (ii) the gender wage gap,<sup>34</sup> (iii) returns to experience, (iv) the dispersion in the endowment of unobservable factors and their returns, (v) hours of work, (vi) labour market participation, and (vii) the education of the working-able population.

<sup>31</sup> The 1992 and 1995 surveys report income for September, the EIH 1997/98 for February 1998, the 1999 survey for September, the 2000/01 survey for March 2001 and, starting from 2002, all surveys report income for November.

<sup>32</sup> Labour income comprises wage earnings and self-employed earnings.

<sup>33</sup> Following Buhmann et al. (1988), the equivalent household income is obtained by dividing household income by the number of equivalent adults raised to 0.8, a parameter which implies mild household economies of scale. Since there is no official measurement of equivalent adult scales from DGEEC in Paraguay, general scales were applied: 0.4 for children < 5 years of age, 0.5 for children > 5 years of age and < 16 years of age, and 1.0 for all individuals > 14 years of age.

### **3.3.1 Returns to education**

An increase in the returns to education implies a widening of the wage gap between workers with high and low education, which in turn would imply a more unequal distribution of individual earnings and probably a more unequal distribution of household income. Table 3.2 shows hourly earnings in constant 2005 Guaranies (Gs.) for workers aged between 12 and 64 with valid and complete answers. The average wage increased 5.3% between 1992 and 1997/98 and dropped 7.8% during the next seven years. Changes were not consistent among educational groups. In the first period of the analysis there were winners and losers. While incomes for workers who had not finished primary education increased slightly, the wages for the next two groups, complete primary education and incomplete secondary education, dropped considerably. Dramatic increases were observed for complete secondary and complete or incomplete college education. In the 1997/98 - 2005 period, income losses were general across all educational groups except incomplete primary education. Losses for higher education are stronger than for lower education. Table 3.2 is a first piece of evidence that changes in relative wages among schooling groups implied an increase in earnings inequality between 1992 and 1997/98 and a decrease thereafter.

<sup>34</sup> Throughout this paper "wage" refers to hourly labour income earned by wageworkers and self-employed workers.


**Table 3.2 Hourly Earnings by Educational Level in Paraguay, Selected Years** 

Source: Author's calculations based on EPH and EIH of National University and DGEEC.

Table 3.3 shows the results of Mincerian log hourly earnings functions estimated using the Heckman procedure to correct for sample selection. The first three columns refer to household heads (mostly men) and the rest to spouses (nearly all women) and other members of the family (roughly half men and half women), respectively. A gender dummy, age and age squared, and a dummy for youngsters less than 18 years old (only relevant for other members) are included in the regression. In addition to these variables, the selection equation includes marital status, number of children and a dummy that takes the value "1" when the individual attends school. Following Bourguignon et al. (1999), it is assumed that labour market participation choices are made within the household in a sequential fashion. Spouses take the head's labour market status into consideration when deciding whether or not to enter the labour market. Other members of the family consider both the head's and the spouse's labour market status.

The coefficients for years of education are positive, so returns to education are always positive. For family heads in 1992, one additional year of schooling increases mean hourly wages by 11.3%, keeping all other factors constant. The corresponding figures for 1997/98 and 2005 are 15.4% and 10.9%, respectively. It is interesting to observe that the returns for spouses follow the same path as those for heads: they increase (from 7.5% to 13.2%) between 1992 and 1997/98 and then drop again (to 10.7%) in 2005. There is no such path for other family members, who lose income in each period. Figure 3.2 shows the predicted hourly earnings for all different years of education. The first panel refers to male heads and the second to other male members, both with age kept constant at 40.
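A small worked example may help to show how these returns translate into wage gaps. The sketch below treats the figures reported for household heads (11.3%, 15.4% and 10.9%) as constant log-point coefficients per year of schooling, which is an assumption made here for illustration; the estimated equations may be richer. It then computes the implied wage ratio between two otherwise identical workers who differ by six years of schooling. The function name `relative_wage` and the six-year gap are hypothetical choices.

```python
import math

# Returns to one extra year of schooling for household heads, as reported in
# the text, read here as coefficients of a log hourly earnings equation.
returns_heads = {"1992": 0.113, "1997/98": 0.154, "2005": 0.109}


def relative_wage(return_per_year, extra_years):
    """Wage ratio implied by a log-linear earnings equation.

    With log(w) = a + b * schooling + ..., two workers who differ only by
    `extra_years` of schooling have a wage ratio of exp(b * extra_years)."""
    return math.exp(return_per_year * extra_years)


# Implied wage ratio between workers with 12 and 6 years of schooling.
ratios = {year: relative_wage(b, 6) for year, b in returns_heads.items()}

for year, ratio in ratios.items():
    print(f"{year}: ratio of {ratio:.2f}")
```

Under these assumptions the implied gap widens between 1992 and 1997/98 and narrows again by 2005, which is the pattern of rising and then falling earnings inequality across educational groups described in this section.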


**Table 3.3 Log-Hourly Earnings Equation Applied to Paraguay, Selected Years**




[Figure 3.2: predicted hourly earnings by years of education, age 40; first panel male heads, second panel other male members]

Source: Predicted hourly earnings from models in Table 3.3.

The wage-education profiles for family heads have a marked positive slope and are almost parallel everywhere, except for the substantial increase in the slope for 1997/98 at the highest levels, from 13 years of education onwards. This certainly contributes to an increase in earnings inequality among household heads with different educational levels. For other male members, the wage-education profiles have almost parallel slopes in all periods, with some differences only at 17 and 18 years of education. Changes in the earnings of other family members could therefore contribute to widening inequality only at high levels of education.

**Figure 3.3 Hourly Earnings-Education Profiles for Women (Spouses), Age 40**

Figure 3.3 shows the profiles for 40-year-old females. As in the case of men, the wage-education profiles show an increasing slope between 1992 and 1997/98, and an opposite movement between 1997/98 and 2005. It is interesting to see that for all three groups (household heads, spouses and other family members) there is a strong increase in returns to higher education in the 1992 - 1997/98 period. This observation is not an artefact of model specification, since years of education did not enter the model in squared form. The reason for high returns to higher education might lie in pure market effects. Even if the Paraguayan economy and its industry are not very sophisticated, there is still a need for highly qualified human resources in any managerial post. As Table 3.10 will show further ahead, the percentage of the workforce that had completed college education did not exceed 2.4% over the whole observation period. Incomplete college education increased from 4.6% in 1992 to 9.4% in 2005; these levels nevertheless remain low and, in a way, can explain why the 2005 return profiles are much "smoother" than in previous years.

Summarizing, there is evidence of a positive relationship between hourly earnings and education, which induces differences in incomes among individuals with different education. According to the evidence presented, these differences, along with inequality, increased between 1992 and 1997/98 and decreased in the next seven years. During this last period the wage-education profile became smoother and less convex, which implies inequality reduction. Although this phenomenon seems widespread across groups, it appears to be more relevant for the groups of household heads and spouses.

Source: Predicted hourly earnings from models in Table 3.3.

### **3.3.2 Gender wage gap**

Table 3.4 presents mean hourly wages by gender. Wages were higher for males in every year. Nevertheless, there are interesting dynamics within the gender wage gap, which decreased from more than 16% in 1992 to less than 2% in 1997/98 and then increased again to some 6% in 2005. Over the whole period, the female mean wage gain was about 6.9%, while male wages increased by only 0.6%. This implies inequality reduction over the whole observation period.


**Table 3.4 Hourly Earnings by Gender in Paraguay, Selected Years**

Source: Author's calculations based on EPH and EIH of National University and DGEEC.

A conditional analysis also shows a shrinking gap for household heads. In Table 3.3 the coefficient for the male dummy is not always positive and significant, but it is clearly decreasing over time. Surprisingly, for other members we observe an important male income loss in 1997/98. However, since the number of working individuals in this group is considerably smaller than in the household heads group, the overall conclusion of a narrowing gender wage gap holds. This shrinking gap has undoubtedly been an equalizing factor in the individual earnings distribution. The effect of this phenomenon on the equivalent household labour income distribution will basically depend on the position of working women in that distribution. Section 5 expands on this point.

### **3.3.3 Returns to experience**

Age is used in this paper as a proxy for experience in the labour market. Table 3.5 shows average hourly earnings for different age groups. In general the wage-age profile has an inverted U shape. Between 1992 and 2005, hourly wages increased only for workers younger than 30 years of age, which is the worst paid group of workers. In principle, this would imply an equalizing effect on the earnings distribution. However, the main benefit is for men and women less than 20 years of age. In 1992 they represented less than 13% of the total working population. During the 1997/98 to 2005 period these gains were lost again. All age groups lost considerable income in the 1997/98 to 2005 period, with a stronger loss for young workers. Once more, since this group is small, its effect on the overall distribution is not large. Since all other age groups lost more or less similar percentages of their wages, there seems to be only a very small, but in the end positive, equalizing effect.


**Table 3.5** 

Source: Author's calculations based on EPH and EIH of National University and DGEEC.

Throughout the whole observation period, the age group between 20 and 29 years of age is the least affected one. This implies certain equalizing effects, since in 2005 this group represented almost 40% of the work force. The most negatively affected groups are those of workers older than 40 years of age. Nevertheless, in 2005 all three of these groups together represented less than half of the work force (48%). Since their mean wages are lower than the mean wages of the largest age group (20 to 29), their inequality-increasing effects should be weaker than the equalizing effects of the 20 to 29 year old group. Summing up, there are some reasons to believe that changes in the returns to experience have led to higher inequality and some reasons to believe the opposite. The analysis of Section 5 will help us to assess the quantitative relevance of each argument.

### **3.3.4 Unobservable factors**

Earnings equations allow the estimation of returns to observable factors like education and experience. The error term is usually interpreted as capturing the joint effect of the endowment of non-observable factors (like individual ability) and its market value on earnings. In general terms, the variance of this error term captures the contribution of dispersion in unobservable factors to general inequality. Table 3.3 reports the standard deviation of the error terms of each log hourly earnings equation (labelled as "sigma"). For instance, for household heads the standard deviation took a value of 0.86 in 1992, 1.01 in 1997/98, and 0.89 in 2005. The substantial increase between 1992 and 1997/98 is also present in the spouses and other members' equations, as well as the reduction towards 2005. According to these results, the effect of changes in unobservable factors would have been strongly unequalizing between 1992 and 1997/98, with some of this additional inequality being reversed in the 1997/98 to 2005 period.

### **3.3.5 Hours of work**

During the period under analysis there was an increase in weekly hours of work between 1992 and 1997/98 and a decrease in the following seven years, almost back to the overall level observed in 1992. Table 3.6 classifies workers by educational level and records the average hours of work of each group. While there are clear gains for completed cycles in the 1992 to 1997/98 period, which deepen inequality, losses in 1997/98 to 2005 are more equally distributed. So, over the whole period we still observe important gains for completed cycles of secondary and college education, but at the same time important losses for groups who did not complete an education cycle. Since gains are larger for higher educational groups, this change would have a non-negligible unequalizing effect on the individual earnings distribution.

**Table 3.6 Weekly Hours of Work by Educational Levels in Paraguay, Selected Years**

Source: Author's calculations based on EPH and EIH of National University and DGEEC.

A conditional analysis yields similar results. Figure 3.4 shows predicted weekly hours of work for male heads from the Tobit censored data model presented in Table 3.7. While hours clearly increased between 1992 and 1997/98 for less-educated (1 to 6 years of education) and for well-educated (more than 13 years of education) male household heads, changes in hours for the rest of the educational groups were only marginal over those years. In the following seven years, the reduction of weekly hours of work is generalized for levels above six years of education. Consequently, over the whole observation period we have evidence of an equalizing effect for workers with between 1 and 6 years of education, an unequalizing effect for workers with between 7 and 15 years of education, and a nearly neutral effect for the highest education levels.

**Figure 3.4 Weekly Hours of Work by Educational Level for Men (Heads of Household), Age 40**

Source: Predicted weekly hours of work from the models in Table 3.7.

### **3.3.6 Labour market participation**

Household income inequality can change not only after changes in hours of work but also as a result of changes in labour market participation. In Table 3.7 individuals are grouped according to whether they are employed, unemployed or inactive. The percentage of unemployed individuals dropped from 4.4% in 1992 to 3.4% in 1997/98 and rose again to 3.8% in 2005. However, notice that the increase in unemployment of 0.4 percentage points between 1997/98 and 2005 was accompanied by a decrease in inactivity of 3.2 percentage points. For inequality measures it is irrelevant whether an individual has zero income as a result of unemployment or because he or she is not looking for a job. Hence the important indicator for possible inequality changes is the overall employment rate, which increased from 57% to 61% and then to 63% during the observation period. These changes might have played a role in inequality changes, depending on the distribution of wage levels and hours of work obtained by the additional work force.


### **Table 3.7 Labour Status by Household Role, Paraguay, Selected Years**

Source: Author's calculations based on EPH and EIH of National University and DGEEC.



Table 3.7 suggests three different stories in the labour market for heads, spouses and other members. Some household heads lost or quit their jobs, especially during the last seven years, becoming either unemployed or leaving the labour force. In contrast, many (30 percentage points) of the spouses left their homes in search of a job: most of them found one between 1992 and 1997/98; however, some did not during the 1997/98 to 2005 period. The other members of the family were less fortunate: even though their participation rate also increased dramatically (10 percentage points), their unemployment rate remained nearly unchanged, at double the unemployment rate of spouses during the last period.


**Table 3.9 Labour Status and Education, Paraguay, Selected Years**

Source: Author's calculations based on EPH and EIH of National University and DGEEC.

Table 3.9 presents the proportion of employed, unemployed and inactive individuals by educational group. In the 1992 to 1997/98 period, the data show a strong increase in employment rates, together with a decrease in inactivity rates, for all educational levels except the complete-college group. Employment increases and inactivity decreases are stronger for higher educational levels. This should imply an increase in inequality. In the 1997/98 to 2005 period, employment keeps growing (and inactivity shrinking) only for primary education and for complete university education. Overall, for this period we should expect an equalizing effect on income inequality. Over the whole period the unequalizing effect of the first period is expected to be stronger than its compensation in the second.

### **3.3.7 Education**

In Paraguay, as in many developing countries, substantial changes in the educational composition of the population took place during the nineties. Table 3.10 presents the proportion of individuals between 12 and 64 years of age by educational level. Between 1992 and 1997/98 there was a contraction in the proportion of youngsters and adults with incomplete primary education and an expansion for incomplete secondary education. Both are groups with low to medium wages. In the 1997/98 to 2005 period the share of incomplete primary education kept falling, complete primary remained almost unchanged, and larger changes were observed in the higher income groups with secondary and college education.


**Table 3.10 Composition of Sample by Educational Level in Paraguay, Selected Years** 

Note: Data cover individuals between 12 and 64 with valid answers.

Source: Author's calculations based on EPH and EIH of National University and DGEEC.

Education is usually viewed as an equalizing force. The traditional argument points out that income disparities in one generation can be reduced in the next if poor children have access to more and better education, so that the educational gap with children from rich families decreases. However, following Kuznets (1955), one can tell a different story if the highly educated rich are a minority and only some poor children manage to make it all the way up to the highest educational (and income) levels. In that case, it is likely that inequality grows as the average education of the population increases, at least until the high education group is relatively large. With multiple educational levels, a similar unequalizing outcome emerges if there is a net outflow from the lowest educational levels and a similar net inflow to the highest levels, with minor changes in the intermediate levels. Changes in the educational structure from 1997/98 to 2005 have more or less taken this form, which supports the assumption of an unequalizing education effect for this period and a more equalizing effect for the 1992 to 1997/98 period. Between 1992 and 1997/98, 5% of the working age population left the primary education group. Almost all of these entered the secondary education group. In the 1997/98 to 2005 period, 12% left primary education. Seven percent entered secondary education, but the other 5% passed on to college level.
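The Kuznets-type argument can be illustrated with a stylized two-group population. The sketch below is only an illustration under strong assumptions: there are just two income levels, and the income values and group shares are hypothetical rather than taken from the Paraguayan data. It shows that the Gini coefficient first rises and then falls as the share of the high-education group grows.

```python
def two_group_gini(share_high: float, income_low: float, income_high: float) -> float:
    """Gini coefficient of a population with only two income levels."""
    mean = (1 - share_high) * income_low + share_high * income_high
    # With two groups the mean absolute difference is 2*p*(1-p)*(high-low);
    # the Gini is half of that, divided by the mean income.
    return share_high * (1 - share_high) * (income_high - income_low) / mean


# Hypothetical incomes: the high-education group earns three times as much.
shares = [0.05, 0.15, 0.30, 0.50, 0.80]
ginis = [two_group_gini(p, 1.0, 3.0) for p in shares]
for p, g in zip(shares, ginis):
    print(f"share of high-education group = {p:.2f}  Gini = {g:.3f}")
```

In this stylized setting inequality peaks while the high-education group is still a minority, which is the mechanism behind the possible unequalizing effect of educational expansion.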

So far we have analyzed several factors that might have affected inequality. Although we have offered some evidence on each effect, we still lack a consistent framework to confirm the sign of each effect and to assess its quantitative relevance. Were changes in the returns to education really an unequalizing force? Were they really a significant force? What about gender, employment or education effects? The next section presents a framework to tackle these issues.

### **3.4 Estimation strategy**

To compute expressions (3.14) and (3.19) in Section 2, we need estimates of the parameters *β* and *λ* and of the residual terms *e*. Also, since we do not have panel data, we need a mechanism to assign observable and unobservable individual characteristics in period *t'* to individuals in *t*. This section explains the strategies used to deal with these problems.

### *Estimation of β and λ*

Let *Li* denote the number of hours worked by person *i*, and *wi* the hourly wage received. Total labour income is given by *Yi* = *Li* · *wi*. The number of hours of work *Li* comes from a utility maximization process which determines optimal participation in the labour market, whereas wages are determined by market forces. The estimation stage specifies models for wages and hours of work which are used in the simulation stage described above.

The econometric specification of the model is similar to the one used by Bourguignon, Fournier and Gurgand (2001), which corresponds to the reduced form of the labour decisions model originally proposed by Heckman (1974). Heckman shows how it is possible to derive an estimable reduced form starting from a structural system obtained from a utility maximization problem of labour-consumption decisions. Leaving technical details aside, the scheme proposed by Heckman has the following structure. Individuals allocate hours to work and domestic activities (or leisure) so as to maximize their utility subject to time, wealth, wages and other constraints. As usual, the solution to this optimization problem can be characterized as demand relations for goods and leisure as functions of the relevant prices. Under general conditions it is possible to invert these functions to obtain prices and wages as functions of quantities of goods and leisure consumed (or its counterpart, hours of work). In particular, the wages obtained in this manner (denoted as *w\**) are to be interpreted as marginal valuations of labour, which are a function of hours of work and other personal characteristics, and represent the minimum wage for which the individual would accept working a given number of hours. In equilibrium, if the individual decides to work, the number of hours devoted to labour should equate their marginal value *w\** with the wage actually received. On the contrary, if the individual decides not to work, it is because this marginal value is greater than the wage offered, given the individual's personal characteristics.

This discussion suggests how to determine the wages at which individuals are willing to work. Along the same lines, it is possible to model the market determinants of wages offered (*w*) as a function of characteristics such as years of education, experience and age, as in a standard Mincer equation (Mincer, 1974). In equilibrium it is assumed that the number of hours of work adjusts to make *w* = *w\**.

The demand-supply relations discussed so far are structural forms in the sense that they reflect relevant economic behaviour in which wages offered and asked depend on the number of hours of work, and are equal in equilibrium. Under general conditions it is possible to derive a reduced form for the equilibrium relations, in which wages and hours of work are expressed as functions of the variables taken as exogenous. In this way, the model has two equations, one for wages (*w\**) and one for the number of hours of work (*L\**), both as functions of factors taken as given which affect wages (*X1*) and hours (*X2*), which may or may not have elements in common. The error terms *e1* and *e2* represent non-observable factors affecting the determination of the endogenous variables. According to the characteristics of the problem, for a particular individual we observe positive values of *w\** and *L\** if and only if the individual actually works. If the person does not work, we only know that the wage offered is less than the wage asked. Consequently, the reduced form model for wages and hours of work is specified as:

$$\begin{array}{ll} w_i^{*} = X_{1i}\,\beta + e_{1i} & i = 1, \ldots, N \\ L_i^{*} = X_{2i}\,\lambda + e_{2i} \end{array} \tag{3.20}$$

with

$$\begin{array}{ll} w_i = w_i^{*} & \text{if } L_i^{*} > 0 \\ w_i = 0 & \text{if } L_i^{*} \le 0 \\ L_i = L_i^{*} & \text{if } L_i^{*} > 0 \end{array}$$

where *wi* and *Li* correspond to observed wages and hours of work, respectively. This notation emphasizes that, consistently with the data used for the estimation, observed wages for a non-working individual are zero.

Following Heckman (1979), for estimation purposes we assume that *e1i* and *e2i* have a bivariate normal distribution with *E(e1i)* = *E(e2i)* = 0, variances *σ*<sub>1</sub><sup>2</sup> and *σ*<sub>2</sub><sup>2</sup>, and correlation coefficient *ρ*. This particular specification corresponds to the "Tobit type III" model in Amemiya's (1985) classification.
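To make the structure of the reduced-form model concrete, the following sketch simulates draws from it. It is a minimal illustration, not the estimation code used for this chapter: the index values standing in for the wage and hours regressions, the standard deviations and the correlation are hypothetical, and setting observed hours to zero for non-workers is an assumption in line with the standard Tobit convention.

```python
import math
import random


def draw_errors(rng, sigma1, sigma2, rho):
    """Draw (e1, e2) from a zero-mean bivariate normal with correlation rho."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    e1 = sigma1 * z1
    e2 = sigma2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return e1, e2


def observe(x1_beta, x2_lambda, e1, e2):
    """Apply the observation rule: wages and hours are seen only if L* > 0."""
    w_star = x1_beta + e1        # latent wage
    l_star = x2_lambda + e2      # latent hours of work
    if l_star > 0:
        return w_star, l_star
    return 0.0, 0.0              # non-working individual (zero hours assumed)


rng = random.Random(0)
# Hypothetical index values X1*beta = 1.0 and X2*lambda = 10.0, illustrative only.
sample = [observe(1.0, 10.0, *draw_errors(rng, 0.9, 30.0, 0.3)) for _ in range(10_000)]
share_working = sum(1 for w, l in sample if l > 0) / len(sample)
print(f"share of simulated individuals who work: {share_working:.3f}")
```

Because the two error terms are correlated, the wages observed for workers are a selected sample, which is why the wage equation is estimated jointly with a selection equation.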

Although it is possible to estimate all the parameters using a full information maximum likelihood method, the implemented methodology adopted a limited information approach, which has notable computational advantages. If instead of hours of work we only had information about whether the individual works or not, the model would correspond to the "Type II" model in Amemiya's classification, whose parameters can be estimated based on a simple selectivity model. More specifically, the regression equation would be the wage equation and the selection equation would be a censored version of the labour supply equation, simply indicating whether the individual works or not. Table 3.3 shows the estimation results of these equations.

On the other hand, the hours of work equation corresponds to the "Tobit type I" model in Amemiya's classification, where the variable is observed only if it is positive. In this case, the parameters of interest can be estimated using a standard censored regression Tobit model. This strategy is consistent though not fully efficient. In any case, the efficiency loss is not necessarily significant for a small sample. The results of the estimation are shown in Table 3.7.

### *Unobservable Factors*

Unobservable characteristics affecting wages are modelled as regression error terms of the wage equation (3.20). Their mean is trivially normalized to zero and their variance is estimated as an extra parameter in the Heckman procedure. In order to simulate the effect on inequality of changes in those unobservable factors between *t* and *t'*, the estimated residuals of the wage equation of year *t* are rescaled by *σ<sub>t'</sub>*/*σ<sub>t</sub>*, where *σ* is the estimated standard deviation of the wage equation. This captures the effect of differences between years in the dispersion of the unobservable term affecting wages, which includes non-observable factors and their market value.<sup>35</sup>

<sup>35</sup> It is important to mention that under the bivariate normal assumption implicit in the Heckman model, once the correlation between unobservables affecting wages and hours worked is kept constant, all remaining effects of unobservables on wages come through the variance. Machado and Mata (1998) allow for heterogeneous behaviour of the error term using quantile regression methods.
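The rescaling of residuals can be sketched in a few lines. The residual values below are hypothetical; the two standard deviations are the ones reported in the text for household heads in 1992 (0.86) and 1997/98 (1.01). The function name is illustrative only.

```python
import statistics


def rescale_residuals(residuals_t, sigma_t, sigma_t_prime):
    """Rescale year-t wage residuals so their dispersion matches year t'."""
    factor = sigma_t_prime / sigma_t
    return [factor * e for e in residuals_t]


# Hypothetical residuals from the year-t wage equation (mean zero).
residuals_1992 = [-1.2, -0.4, 0.0, 0.3, 1.3]
simulated = rescale_residuals(residuals_1992, sigma_t=0.86, sigma_t_prime=1.01)
ratio = statistics.pstdev(simulated) / statistics.pstdev(residuals_1992)
print(f"dispersion ratio after rescaling: {ratio:.4f}")
```

The rescaling leaves the ranking of individuals and the zero mean of the residuals unchanged; only their spread is altered.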

To study employment effects, the decomposition methodology requires simulating earnings for people who do not work. Since we do not observe their wages, we cannot apply equations (3.20) and (3.21) to estimate the unobservable factors. For each individual in that situation, we assign as "error term" a random draw from the bivariate normal distribution implicit in the wage-labour supply model (3.20) and (3.21), whose parameters are consistently estimated by the Heckman procedure. Residuals are sampled from the distribution of unobservable factors, but conditional on the individual's observed behaviour. That is, error terms are drawn from the bivariate normal distribution and a prediction (based on observable characteristics, estimated parameters and sampled errors) is computed for wages and hours worked. If the resulting prediction yields positive hours worked (so that the prediction is inconsistent with the observed behaviour of this group), the error term is sampled again until non-positive hours of work are predicted.
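The resampling rule for non-workers amounts to rejection sampling. The sketch below illustrates it for a single individual; the parameter values and the function name are hypothetical, and the cap on the number of attempts is a safeguard added here, not part of the procedure described in the text.

```python
import math
import random


def draw_nonworker_errors(rng, x2_lambda, sigma1, sigma2, rho, max_tries=100_000):
    """Redraw (e1, e2) until predicted hours X2*lambda + e2 are non-positive."""
    for _ in range(max_tries):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        e1 = sigma1 * z1
        e2 = sigma2 * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        if x2_lambda + e2 <= 0:      # consistent with observed non-work
            return e1, e2
    raise RuntimeError("no admissible draw found; check the parameter values")


rng = random.Random(1)
# Hypothetical parameter values, for illustration only.
draws = [draw_nonworker_errors(rng, x2_lambda=5.0, sigma1=0.9, sigma2=20.0, rho=0.4)
         for _ in range(2_000)]
mean_e1 = sum(e1 for e1, _ in draws) / len(draws)
print(f"mean wage error among simulated non-workers: {mean_e1:.3f}")
```

With a positive correlation between the two error terms, the wage errors accepted for non-workers are negative on average, which reflects the selection effect that the conditional sampling is meant to preserve.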

### *Individual characteristics*

For the estimation of the education effect it is necessary to simulate the educational structure of year *t'* on the year *t* population, since we do not have the same individuals in both years. Instead of following Bourguignon, Fournier and Gurgand (2001) and estimating a parametric equation that relates individual educational level to other individual characteristics (age and gender), a rough nonparametric mechanism was applied. The adult population was divided into ten homogeneous groups by gender and age, and then the educational structure of a given cell in year *t'* was replicated in the corresponding cell in year *t*.
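One possible way to implement such a cell-based replication is sketched below. The text does not specify how education levels are reassigned within a cell, so the rank-preserving rule used here is an assumption, and the data, field names and function name are hypothetical.

```python
from collections import defaultdict


def replicate_education(pop_t, pop_t_prime):
    """Give each (gender, age group) cell of year t the education shares of year t'.

    Each person is a dict with keys 'gender', 'age_group' and 'educ' (years).
    Within a cell, ranks by education are preserved: the k-th least educated
    person in year t receives the education found at the same relative
    position in the year-t' cell. (Rank preservation is an assumption.)
    """
    cells_t, cells_tp = defaultdict(list), defaultdict(list)
    for person in pop_t:
        cells_t[(person["gender"], person["age_group"])].append(person)
    for person in pop_t_prime:
        cells_tp[(person["gender"], person["age_group"])].append(person["educ"])

    simulated = []
    for cell, people in cells_t.items():
        target = sorted(cells_tp.get(cell, []))
        ordered = sorted(people, key=lambda p: p["educ"])
        for rank, person in enumerate(ordered):
            new_person = dict(person)
            if target:  # leave the cell unchanged if year t' has no one in it
                position = int((rank + 0.5) / len(ordered) * len(target))
                new_person["educ"] = target[position]
            simulated.append(new_person)
    return simulated


pop_1992 = [{"gender": "f", "age_group": "20-29", "educ": e} for e in (3, 6, 6, 9)]
pop_2005 = [{"gender": "f", "age_group": "20-29", "educ": e}
            for e in (6, 9, 12, 12, 12, 16, 16, 16)]
simulated_1992 = replicate_education(pop_1992, pop_2005)
print(sorted(p["educ"] for p in simulated_1992))
```

The simulated population keeps the age and gender composition of year *t* while taking on the educational distribution of year *t'* within each cell.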

### *Poverty*

Poverty, measured officially by income from all sources, decreased in Paraguay between 1992 and 1997/98, increased until 2002 and has fallen slightly again since 2003. Poverty and inequality are not the same, but they are closely related. A higher level of inequality weakens the poverty reduction driven by economic growth, since the additional income and benefits from growth are not equally distributed among the population. Inequality reduction is not a poverty reduction tool in itself, but it can improve the performance and impact of poverty reduction processes. It is therefore of interest to examine what effects the simulated inequality changes could have on poverty levels. Since the simulations of changes in inequality are based on labour income, we need a labour income poverty line. There is no such line officially fixed for Paraguay. As a proxy, the mean equivalent household labour income of all households classified as poor by the official per capita income poverty line was taken as an income poverty line to check the poverty reduction effects of simulated inequality changes. Since this is only a very rough proxy, the measurement results should not be taken as real changes in poverty related to inequality changes, because there are other kinds of income and many more factors related to poverty change as a whole. Nevertheless, the simulated poverty changes can be understood as a proxy for the sign of the poverty impact of an observed change in inequality, and they help to identify which kinds of inequality change would have a stronger effect on poverty and which would not.
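The construction of the proxy poverty line and the resulting poverty comparison can be sketched as follows. All household values are hypothetical and the function names are illustrative; the headcount ratio is used here as one simple poverty measure, which is an assumption rather than a statement about the measure used in the tables.

```python
def labour_poverty_line(households):
    """Mean equivalent household labour income of officially poor households."""
    poor = [h["labour_income"] for h in households if h["officially_poor"]]
    return sum(poor) / len(poor)


def headcount(incomes, line):
    """Share of households whose income lies below the poverty line."""
    return sum(1 for y in incomes if y < line) / len(incomes)


# Hypothetical households: equivalent labour income and official poverty status.
households = [
    {"labour_income": 40.0, "officially_poor": True},
    {"labour_income": 60.0, "officially_poor": True},
    {"labour_income": 80.0, "officially_poor": True},
    {"labour_income": 150.0, "officially_poor": False},
    {"labour_income": 300.0, "officially_poor": False},
]
line = labour_poverty_line(households)
observed = headcount([h["labour_income"] for h in households], line)
# Hypothetical simulated incomes after a counterfactual change.
simulated = headcount([35.0, 50.0, 85.0, 140.0, 280.0], line)
print(f"line = {line}, change in headcount = {simulated - observed:+.2f}")
```

Only the sign of the change is meant to be informative, in keeping with the caveat that the line is a rough proxy.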

### **3.5 Results**

This section reports the results of performing the decomposition described in Section 2 using the estimation strategy outlined in Section 4. The objective is to shed light on the quantitative relevance of the various phenomena discussed in Section 3 for inequality changes during the 1992-2005 period.

Before showing the results two points must be clarified. First, the decompositions are path dependent. Hence, results are reported using alternatively *t* and *t'*  as the base year. Second, the simulations are carried out for the whole distribution.

Tables 3.11 to 3.13 show the results with both *t* and *t'* as base years. Table 3.14 reports the average of these results. A positive number indicates an unequalizing effect. A large number compared to the other figures in the column suggests a significant effect. For instance, the returns-to-education effect on the individual earnings distribution in the 1992 to 1997/98 period is -2.1. This roughly means that the Gini would have decreased by 2.1 points if only the returns to education (*i.e.* the coefficients of the educational dummies in the wage equation) had changed between those years. The number -2.1 tells us two things: (i) since it is a negative number, it implies that the returns-to-education effect was inequality-decreasing, and (ii) since it is large compared to the other numbers in the column, it indicates that the change in returns to education was an economically significant factor affecting inequality. Its effect on the equivalent household income distribution is also inequality-decreasing, but to a lesser degree. Nevertheless, returns to education seem to have a poverty-increasing effect. The story here is that changes in returns to education are related to income losses, which certainly make the income distribution more equal. At the same time, however, lower income groups suffer stronger losses, so some of them fall below the poverty line.
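The way a single entry of these tables is read can be illustrated with a small sketch. The earnings vectors below are hypothetical, and the function names are illustrative; the sketch only shows how an effect in Gini points is obtained from an observed and a counterfactual distribution, and how the two base-year results are averaged.

```python
def gini(incomes):
    """Gini coefficient (0 to 1) of a list of non-negative incomes."""
    values = sorted(incomes)
    n, total = len(values), sum(values)
    # Rank-based formula: G = 2*sum(rank*y)/(n*sum(y)) - (n+1)/n.
    weighted = sum(rank * y for rank, y in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def effect_in_gini_points(gini_with_t_params, gini_with_t_prime_params):
    """Effect of moving one set of parameters from t to t', in Gini points."""
    return 100.0 * (gini_with_t_prime_params - gini_with_t_params)


# Base year t: observed earnings versus earnings with year-t' returns (hypothetical).
earnings_t = [10.0, 20.0, 30.0, 40.0, 100.0]
earnings_t_sim = [12.0, 22.0, 30.0, 38.0, 90.0]
effect_base_t = effect_in_gini_points(gini(earnings_t), gini(earnings_t_sim))

# Base year t': earnings with year-t returns versus observed earnings (hypothetical).
earnings_tprime = [11.0, 21.0, 32.0, 41.0, 95.0]
earnings_tprime_sim = [9.0, 19.0, 32.0, 43.0, 104.0]
effect_base_tprime = effect_in_gini_points(gini(earnings_tprime_sim), gini(earnings_tprime))

average_effect = 0.5 * (effect_base_t + effect_base_tprime)
print(f"base t: {effect_base_t:.2f}, base t': {effect_base_tprime:.2f}, "
      f"average: {average_effect:.2f}")
```

A negative value corresponds to an equalizing effect, and the two base-year results generally differ because the decomposition is path dependent.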


#### **Table 3.11 Decomposition of the Change in Gini coefficient for Earnings and Equivalent Household Labour Income and Equivalent Labour Household Income Poverty, Paraguay 1992-1997/98**

Note: The earnings distribution includes those individuals with *Yit* > 0 and *Yit(kt')* > 0. The equivalent household labour income distribution includes those individuals with *Yit* ≥ 0 and *Yit(kt')* ≥ 0. Non-labour income is not considered.

Source: Author's calculations based on EPH and EIH of National University and DGEEC.



Note: The earnings distribution includes those individuals with *Yit* > 0 and *Yit(kt')* > 0. The equivalent household labour income distribution includes those individuals with *Yit* ≥ 0 and *Yit(kt')* ≥ 0. Non-labour income is not considered.

Source: Author's calculations based on EPH and EIH of National University and DGEEC.


### **Table 3.13 Decomposition of the Change in Gini coefficient for Earnings and Equivalent Household Labour Income and Equivalent Labour Household Income Poverty, Paraguay 1992 - 2005**

Note: The earnings distribution includes those individuals with *Yit* > 0 and *Yit(kt')* > 0. The equivalent household labour income distribution includes those individuals with *Yit* ≥ 0 and *Yit(kt')* ≥ 0. Non-labour income is not considered.

Other factors -0.2 -0.5 -0.5

Source: Author's calculations based on EPH and EIH of National University and DGEEC.


**Table 3.14 Decomposition of the Change in the Gini Coefficient and Equivalent Household Income Poverty Rates Changing the Base Year, Paraguay, Selected Periods** 

Note: The earnings distribution includes those individuals with *Yit* > 0 and *Yit(kt')* > 0. The equivalent household labour income distribution includes those individuals with *Yit* ≥ 0 and *Yit(kt')* ≥ 0. Non-labour income is not considered.

Source: Author's calculations based on EPH and EIH of National University and DGEEC.

### *Returns to education*

Table 3.14 confirms the assumption of Section 3. Changes in the returns to education had an unequalizing effect on the individual earnings distribution between 1992 and 1997/98 and a strong equalizing effect in the next seven years. The effects on the equivalent income distribution were similar. Over the whole period 1992-2005, changes in the returns to education (in terms of hourly wages) represented an important inequality-decreasing factor.

### *Gender wage gap*

As expected, changes in the gender parameter of the wage equation implied an equalizing effect on the individual earnings distribution. During the last decade the gender gap has shrunk substantially. Given that women earn less than men, that movement had an unambiguous inequality-decreasing effect on the earnings distribution. It is interesting to notice that the gender effect becomes more important in the equivalent household labour income distribution. Two factors combine to generate this result. First, female workers are more concentrated in the lower part of the distribution than men (mainly in rural areas), and therefore a relative wage change implies a decrease in household income inequality. Second, a proportional wage increase for all females is more relevant in low-income families, since women's earnings are a more significant part of total resources in those households than in rich families. A clear example is the disproportionate number of poor households headed by working women. Consequently, the gender effect reduces poverty.

### *Returns to experience*

Changes in the returns to experience (age) implied an unequalizing effect on the earnings distribution during the period 1992-1997/98 and an equalizing effect in the following seven years. A brief explanation follows. The effects of changes in the returns to experience did not have the same direction and impact for all education levels. There was a clear equalizing effect for low education levels, unclear effects for medium education levels and an unequalizing effect for high education levels. The empirical evidence shows that the increase in inequality at higher education levels outpaced the equalizing effects at lower levels. Nevertheless, the equalizing effects for lower education levels do exist and are associated with higher incomes, so that poverty tends to fall. Between 1997/98 and 2005 there is a more generalized equalizing effect, which is, however, associated with income losses. So inequality decreased while poverty increased. Results for equivalent household income show the same patterns.

### *Unobservable Factors*

Changes in endowments and returns to unobservable factors implied unequalizing changes in wages in the 1992-1997/98 period, associated with poverty reduction, and opposite effects over the following seven years, for both earnings and equivalent household income.

### *Hours of Work*

We carried out three simulations to assess the relevance of employment changes for inequality. In all of these, the distribution in the base year is simulated using the parameters of the Tobit employment equation of the other year. In the *employment* and *participation* effects, people with non-positive simulated hours of work are assigned zero earnings, so they remain included in the data set. People who work in the simulation are assigned the actual base year wage, together with the simulated hours of work in the *employment* effect and the actual hours of work in the *participation* effect. The third simulation is intended to single out the impact of changes in hours worked. In this simulation, changes in labour status are ignored (*i.e.* individuals whose status changes keep their current earnings), and hours of work are changed only for individuals who work both in the base year and in the simulation.
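The difference between the three simulations can be made explicit in a short sketch. It is restricted, as a simplifying assumption, to individuals who are observed working in the base year; entrants into employment would additionally need simulated wages, which is omitted here. The values and the function name are hypothetical.

```python
def simulate_earnings(wage, hours, sim_hours, effect):
    """Counterfactual earnings of one base-year worker under one of three effects.

    wage, hours: base-year hourly wage and hours actually worked (hours > 0).
    sim_hours:   hours predicted with the other year's hours-equation parameters.
    """
    works_in_simulation = sim_hours > 0
    if effect == "employment":      # labour status and hours both change
        return wage * sim_hours if works_in_simulation else 0.0
    if effect == "participation":   # only status changes, hours stay as observed
        return wage * hours if works_in_simulation else 0.0
    if effect == "hours":           # only hours change, status changes are ignored
        return wage * sim_hours if works_in_simulation else wage * hours
    raise ValueError(f"unknown effect: {effect}")


# Hypothetical workers: (hourly wage, actual hours, simulated hours).
workers = [(2.0, 40.0, 45.0), (5.0, 48.0, 44.0), (1.5, 20.0, -3.0)]
results = {
    effect: [simulate_earnings(w, h, s, effect) for w, h, s in workers]
    for effect in ("employment", "participation", "hours")
}
for effect, earnings in results.items():
    print(effect, earnings)
```

The third worker, whose simulated hours are non-positive, receives zero earnings in the first two simulations but keeps current earnings in the hours simulation.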

An equalizing employment effect shows up in the individual earnings and equivalent household income distribution for the whole period. Nevertheless, in the first period it is associated with poverty reduction (income increase for lower earnings) and with a poverty increase in the second period. Notice that since we exclude those individuals with zero earnings from that distribution, the employment effect is basically the result of relative changes in the number of hours of work. The figures for the hours-of-work and participation effects confirm this assertion. As discussed in Section 3, the nineties witnessed a substantial increase in hours of work in general.

### *Employment*

Labour participation grew fast after 1992, a period of economic growth and creation of new and additional labour opportunities. Consequently, inequality and poverty decreased during the first period, for earnings and equivalent household income. Part of these gains was lost after 1998, when the economy entered a period of recession. The middle classes suffered the largest income losses, so inequality and poverty increased after 1998, for both earnings and equivalent household income. Nevertheless, the losses of the second period were not as strong as the gains of the first, so the overall effect was a reduction in inequality and poverty.

### *Education*

Paraguay has witnessed important changes in the educational composition of its population since the implementation of the educational reform started in 1994. The result for the first period was a reduction in inequality for both earnings and equivalent household income, together with poverty reduction. In the second period the earnings distribution keeps improving, but the equivalent household income distribution becomes more unequal and poverty increases. This might be the result of the increase in workers with university level education occurring at the same time as income losses for workers with primary education.

### *Other factors and Interactions*

The last row in Table 3.14 is calculated as a residual. It encompasses the effects of interaction terms and many factors not considered in the analysis. According to Table 3.14, in general, this term is lower than the mean of the other terms in the decomposition, implying either that the factors not considered in the analysis are not extremely important or that they tend to compensate each other.
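In symbols, the residual row can be written as follows. The notation is an assumption introduced here for illustration and does not appear in the original: $\Delta I$ denotes the observed change in the inequality (or poverty) index between the two years, and $E_k$ the simulated effect of factor $k$ among the $K$ factors reported in the table.

$$
R = \Delta I - \sum_{k=1}^{K} E_k
$$

A small $|R|$ relative to the $E_k$ is consistent with either reading given above: the omitted factors matter little, or they offset one another.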

### **3.6 Discussion**

The results of the paper suggest that the smaller change in inequality between 1992 and 1997/98 is mainly the result of employment (including hours of work) and education effects, the latter characterized by an expansion of primary schooling. The stronger inequality reduction after 1997 is due to returns to education, hours of work (since unemployment increased) and unobservable factors. Perhaps the most interesting finding of the paper is that the general trend of decreasing inequality (interrupted by the 2000-2002 economic crises) held over the observation period, so that inequality fell even during periods of rising poverty. Comparing the post-2002 period with 1997/98, we can see that in 2005 poverty was higher but inequality was lower. As shown above, labour market conditions (a mix of participation rates, unemployment, hours of work and returns to education), together with unobservable factors, are the mechanisms that helped to decrease inequality in the 1997/98 to 2005 period; poverty, however, increased at the same time. Labour income in 2005 was, in general, lower than in 1992. Income was lost over the whole distribution, and to a greater extent by higher income groups, which is why inequality decreased. However, since income also decreased for the poor, some formerly non-poor workers and their households fell below the poverty line and are now what are known as "new poor" households. Good inequality reduction policies should seek the opposite outcome: inequality and poverty reduction at the same time.
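The mechanism described here, inequality falling while poverty rises, can be checked on a toy example. The incomes, loss rates and poverty line below are invented purely for illustration and are not Paraguayan data; `gini` and `headcount` are hypothetical helper names, with the headcount ratio corresponding to the FGT0 measure.

```python
def gini(incomes):
    """Gini coefficient from the rank-weighted formula on sorted incomes."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def headcount(incomes, line):
    """Share of people with income strictly below the poverty line (FGT0)."""
    return sum(1 for x in incomes if x < line) / len(incomes)


# Hypothetical incomes and poverty line, chosen only for illustration.
before = [100.0, 120.0, 300.0, 800.0]
line = 95.0

# Everyone loses income, but the two richer people lose proportionally more.
losses = [0.10, 0.10, 0.30, 0.30]
after = [x * (1.0 - loss) for x, loss in zip(before, losses)]

print("Gini before/after:", gini(before), gini(after))
print("Headcount before/after:", headcount(before, line), headcount(after, line))
```

Because the larger proportional losses fall on the top of the distribution, the Gini coefficient declines, while the loss at the bottom pushes one person below the line and raises the headcount.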

One of the surprising findings of this paper is the extremely high returns to education in 1997/98. Education reform started in 1994 with primary education, so its results could not yet have had an impact on the labour market in 1997/98. However, the decrease in returns to education after 1998 can be related to the education reform, at least for secondary education: as the labour force increases its human capital at a massive rate, returns to education tend to decrease. We checked returns to education in the 1999 and 2000/01 surveys. In both cases, returns to education are surprisingly high, although slightly lower than the 1997/98 results. Consequently, there seems to be no measurement error. Returns to education fall sharply in 2002, just when the economic crisis deepened. Thus, decreasing returns to education seem to reflect a mix of lower remuneration levels throughout the economy and the results of education reform.

Changes in inequality at the equivalent household income level are difficult to understand. Nevertheless, they are included in this paper to show that even if inequality changes related, for instance, to labour participation are important at the individual level, their impact at the household level does not necessarily have to be the same. Interestingly, gender wage gap reductions have a strong poverty reduction impact at the household level. The chain of effects seems to be that the additional income for women, leaving everything else constant, also benefits female-headed households, most of which live below the poverty line.

The same factors that explained inequality changes (employment, hours of work and education) have the main impact on changes in income poverty levels, as should be expected. Once more, the signs and the "rank" of these poverty estimates should be considered rather than the magnitude of the simulated changes in poverty, since the estimation method was not very sophisticated.

### **3.7 Conclusions**

The decomposition methodology used in this paper can describe more completely the reasons for changes in aggregate income inequality within particular economies. A country may experience relatively little change in the overall level of income inequality despite significant changes in the composition of that inequality. Analyzing several countries over the same period with this kind of inequality decomposition produces more detailed results than cross-country comparisons, and may show how, despite similar economic crises and common trends in a given region and period, overall levels of and changes in income inequality remain distinct by country (Bourguignon, Ferreira, Lustig, 2005). Dion (2007) concludes from their comparison of several countries that differences in inequality outcomes likely reflect differences not only in endowments, prices and occupation effects, but also in the policy decisions and priorities of different governments.

This paper contributes to an emerging discussion in Paraguayan development policy, which is starting to shift its focus somewhat away from poverty reduction towards inequality reduction, now understanding poverty in part as a consequence of inequality. The contribution consists of presenting the results of a microeconometric decomposition methodology. This technique allows the assessment of the relevance of various factors that affected inequality over a period of 13 years, between 1992 and 2005.

This paper is not about Paraguayan poverty or inequality reduction policies. Nevertheless, some concluding comments on these can help to better understand the results obtained. The story we can tell, knowing Paraguayan politics, is that market forces and business cycles have a stronger impact on Paraguayan inequality and poverty than specific policies do. This is, on the one hand, because there are very few such policies and, on the other hand, because most of them lack scale, so even if the policy concept is adequate, no meaningful impact can be achieved. Education policy is one of the exceptions.

There are also structural problems in the Paraguayan economy. The informal sector accounts for about 50% of the labour force, so any initiative taken by the government, for instance on the legal minimum wage, will not have any impact on half of the labour force. These kinds of problems strongly limit the possible impact of policies. One example illustrates the importance of these phenomena: in 2005, only 10% of the labour force had a labour income equal to or above the legal minimum wage.

Consequently, if regional market forces and business cycles tell almost the whole story of inequality and poverty changes, we should better understand how this works. Labour income in Paraguay is much more than monetary income. It includes monetarized values for self-consumption of agricultural products cultivated by farmers. In 1997, agricultural GDP growth was 2.2 times higher than overall GDP growth. More than 35% of the labour force works in the agricultural sector. Cultivating land requires almost the same amount of work (in hours) year after year, but if the harvest is good and prices are even better for a short period of years, the returns to education (even for low-educated small farmers) will be high in those years. Returns to education benefit from an open economy in "good times", but in "bad times" external shocks such as the Brazilian devaluation and the Argentine default strike even harder.

Thus, if social policies are necessary to reduce inequality, but the negative impacts of economic and market forces are stronger than the positive impacts that social policies could generate, protection mechanisms for vulnerable groups may be the necessary complement to social policies, and research should focus on these issues.

### **A - Annex to Chapter 1**

### **Figure A1 Structured error per capita income estimates 1992 at district level**

Source: Author's calculations based on ECV 1992 and CNPV 1992

### **Figure A2 Unstructured error per capita income estimates 1992 at district level**

Source: Author's calculations based on ECV 1992 and CNPV 1992

### **Figure A3 Structured error per capita income estimates 2002 at district level**

Source: Author's calculations based on EPH 2002 and CNPV 2002

### **Figure A4 Unstructured error per capita income estimates 2002 at district level**

Source: Author's calculations based on EPH 2002 and CNPV 2002

### **Figure A5 Relative change in FGT0 per capita income - period 1992 - 2002 at district level**

Source: Author's calculations based on ECV 1992, CNPV 1992, EPH 2002 and CNPV 2002

### **Figure A6 Relative change in Gini per capita income - period 1992 - 2002 at district level**

Source: Author's calculations based on ECV 1992, CNPV 1992, EPH 2002 and CNPV 2002


### **B - Annex to Chapter 2**

Source: Author's calculations based on results in Chapter 1

# **Bibliography**


### **Göttinger Studien zur Entwicklungsökonomik Göttingen Studies in Development Economics**

### Herausgegeben von/Edited by Hermann Sautter und/and Stephan Klasen

Volumes 1-8 are available from Vervuert Verlagsgesellschaft (Frankfurt/M.).


www.peterlang.de